Abstract

Passive network tomography uses end-to-end observations of network communication to characterize the network,
for instance to estimate the network topology and to localize random or adversarial glitches. Under the
setting of linear network coding, this work provides a comprehensive study of passive network tomography in the
presence of network (random or adversarial) glitches. To be concrete, this work is developed along two directions:
1. Tomographic upper and lower bounds (i.e., the most adverse conditions in each problem setting under which 
network tomography is possible, and corresponding schemes (computationally efficient, if possible) that achieve this 
performance) are presented for random linear network coding (RLNC). We consider RLNC designed with common 
randomness, i.e., the receiver knows the random code-books of all nodes. (To justify this, we show an upper bound
for the problem of topology estimation in networks using RLNC without common randomness.) In this setting 
we present the first set of algorithms that characterize the network topology exactly. Our algorithm for topology 
estimation with random network errors has time complexity that is polynomial in network parameters. For the 
problem of network error localization given the topology information, we present the first computationally tractable 
algorithm to localize random errors, and prove it is computationally intractable to localize adversarial errors. 2. New 
network coding schemes are designed that improve the tomographic performance of RLNC while maintaining the 
desirable low-complexity, throughput-optimal, distributed linear network coding properties of RLNC. In particular, 
we design network codes based on Reed-Solomon codes so that a maximal number of adversarial errors can be 
localized in a computationally efficient manner even without the information of network topology. The tomography 
schemes proposed in the paper can be used to monitor networks with other glitches such as packet losses and link
delays, etc. 
I. Introduction

The goal of passive network tomography (or passive network monitoring) is to use end-to-end observations of 
network communication to infer the network topology, estimate link statistics such as loss rate and propagation 
delay, and locate network failures lO. 

In networks using linear network coding each node outputs linear combinations of received packets; this has 
been shown to attain optimal multicast throughput [4]. In fact, even random linear network codes (where each node
independently and randomly chooses the linear combinations used to generate transmitted packets) suffice to attain 
the optimal multicast throughput [5], [6], [7]. In addition to their desirable distributed nature, such schemes also
have low design and implementation complexity [6], Q.

The main observation driving this work is that the linear transforms arising from random linear network coding 
have specific relationships with the network structure, and these relationships can significantly aid tomography. 
Prior work [8], [9] has also observed this relationship.

Toy example for error localization: Consider the tomography problem in Figure 1. Source $s$ transmits probe
symbols (say 1 and 2) to receiver $r$ via intermediate node $u$. Suppose edge $e_1$ is erroneous and adds (say) 2 to
every symbol transmitted over it. Receiver $r$ knows the probe symbols, network, and communication schemes a
priori. It also knows that one of the links is erroneous (though it does not know in what manner), and wants to locate
the erroneous link.




(a) Routing case (b) Coding case

Fig. 1. A tomographic example for locating an error at edge $e_1$. In Figure 1(a), observing the error vector $E = [2\ 0]^T$ is not enough to
distinguish the error locations $e_1$ and $e_3$. In Figure 1(b), since network coding is used by intermediate node $u$, the information of $E = [2\ 2]^T$
is enough to locate the erroneous edge $e_1$.

The case where the network communicates only via routing is shown in Figure 1(a). The probe symbols 1 and 2
are transmitted over edges $e_1$ and $e_2$ respectively to node $u$. Due to the error introduced over $e_1$, node $u$ receives
symbols 3 and 2 via edges $e_1$ and $e_2$ respectively, and forwards them to node $r$ via edges $e_3$ and $e_4$
respectively. Node $r$ receives two symbols from $e_3$ and $e_4$, denoted by the vector $\mathbf{y} = [3\ 2]^T$. Since $r$ knows the
probe symbols a priori, it can compute the error vector to be $E = \mathbf{y} - [1\ 2]^T = [2\ 0]^T$. Using $E$ and its knowledge
of the routing scheme, node $r$ can infer that the error happened on the routing path $\{e_1, e_3\}$, but cannot figure out
whether the error occurred on $e_1$ or $e_3$.
Figure 1(b) shows the case where node $u$ applies linear network coding to transmit symbols. In particular, node
$u$ outputs $x_3 = x_1 + 2x_2$ to link $e_3$ and $x_4 = x_1 + x_2$ to $e_4$, where $x_1$ and $x_2$ are the symbols that node $u$ receives
from $e_1$ and $e_2$, and $x_3$ and $x_4$ are the symbols to be sent over $e_3$ and $e_4$. For an additive error $e$ at $e_1$, $e_2$,
$e_3$ or $e_4$, the receiver $r$ would observe error vectors $e[1\ 1]^T$, $e[2\ 1]^T$, $e[1\ 0]^T$ or $e[0\ 1]^T$ respectively. Thus, errors in
different links result in observed error vectors lying in different vector spaces. Such linear algebraic characteristics
of networks can be exploited to locate the erroneous link. Specifically, if error $e = 2$ is injected into $e_1$, node $r$
receives $\mathbf{y} = [7\ 5]^T$. Knowing in advance the probe symbols and node $u$'s coding scheme, the receiver $r$ computes
the error vector as $E = \mathbf{y} - [1 + 2 \cdot 2,\ 1 + 2]^T = [2\ 2]^T$. Upon observing $E = [2\ 2]^T$ and comparing with the set of
possible error vectors corresponding to different error locations, $r$ can determine that $e_1$ is the erroneous link and
the error is $e = 2$. □
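The coding case of the toy example can be replayed in a few lines. The sketch below is only an illustration of the reasoning above: it uses plain integer arithmetic as in the text, and the names `CODING`, `SIGNATURES`, `transmit` and `locate` are hypothetical helpers introduced here, not part of any scheme in the paper.

```python
# Sketch of the Figure 1(b) toy example, using plain integer arithmetic as in the text.
# Rows give the coefficients node u applies to (x1, x2) to form x3 (on e3) and x4 (on e4).
CODING = [(1, 2), (1, 1)]

# Error vector observed at r for a unit additive error on each edge.
SIGNATURES = {"e1": (1, 1), "e2": (2, 1), "e3": (1, 0), "e4": (0, 1)}


def transmit(probes, bad_edge=None, err=0):
    """Return (x3, x4) as received by r; `err` is added on `bad_edge` if given."""
    x1, x2 = probes
    if bad_edge == "e1":
        x1 += err
    if bad_edge == "e2":
        x2 += err
    x3 = CODING[0][0] * x1 + CODING[0][1] * x2
    x4 = CODING[1][0] * x1 + CODING[1][1] * x2
    if bad_edge == "e3":
        x3 += err
    if bad_edge == "e4":
        x4 += err
    return (x3, x4)


def locate(probes, received):
    """Return the error vector E and every (edge, error value) pair that explains it."""
    expected = transmit(probes)
    E = (received[0] - expected[0], received[1] - expected[1])
    matches = []
    for edge, (a, b) in SIGNATURES.items():
        # E is a nonzero multiple of (a, b) iff E != 0 and the 2x2 determinant vanishes.
        if E != (0, 0) and E[0] * b - E[1] * a == 0:
            scale = E[0] // a if a else E[1] // b
            matches.append((edge, scale))
    return E, matches


received = transmit((1, 2), bad_edge="e1", err=2)
print(received)                  # (7, 5)
print(locate((1, 2), received))  # ((2, 2), [('e1', 2)])
```

Because the four signatures are pairwise non-parallel, any single nonzero error is matched by exactly one edge in this sketch.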

While the toy example above might give the impression that the coding scheme needs to be carefully designed for 
the communication problem at hand, our results in this paper show that in fact random linear coding suffices to result 
in tomographic schemes that are distributed and have low computational and communication overhead. Further, 
if end-to-end network error-correcting codes (see for instance [10], [11]) are used for the network communication
layer, in addition network tomography can also be implemented in a "passive" manner, i.e., no dedicated probe 
messages are necessary. Thus throughout this work, the phrase "network tomography" stands for "passive network 
tomography" unless otherwise specified. 

In this work we consider a network in which all nodes perform linear network coding. Besides receiving the 
messages, the receiver(s) wants to recover the network topology, and then detect and locate adversarial attacks, and 
random glitches (errors or erasures). 

We perform a comprehensive study of passive network tomography in the presence of network errors, under the 
setting of network coding. In particular, we seek answers to the following questions: 

• In networks performing random linear network coding (RLNC), what are the appropriate tomographic upper 
and lower bounds? That is, what are the most adverse conditions in each problem setting under which network 
tomography is possible, and what schemes (computationally efficient, if possible) achieve this performance? 

• Are there any linear network coding schemes that improve upon the tomographic upper bounds for RLNC while 
maintaining their desirable low-complexity, throughput-optimal, distributed linear network coding properties? 



A. Main contributions 



We now examine the relationship that linear transforms arising from random linear network coding have with
the structure of the network. For this we find it useful to define the impulse response vector (IRV) $\mathbf{t}'(e)$ for every
link $e$ as the transform vector from link $e$ to the receiver (see Section III-A for details). As shown in subsequent
sections, each $\mathbf{t}'(e)$ can be treated as the fingerprint of the corresponding link $e$. Any error on $e$ exposes its fingerprint,
allowing us to detect the location of the error. Note that all the tomography schemes proposed in the paper for
network errors can be used to monitor networks with other glitches such as network erasures (i.e., packet losses)
and link delays. We defer discussion of these related topics to the Appendix.

• For network tomography under RLNC, our results are categorized into two classes: 

1) Topology estimation. For networks suffering from random or adversarial errors, we provide the first algorithms
(under some sufficient conditions) that estimate the network topology (in the case of random errors, our
algorithms are computationally efficient). We also provide necessary conditions for such topology estimation
to be possible (there is currently a gap between our necessary and sufficient conditions). Common randomness
is assumed, i.e., the coding coefficients of each node are chosen from a random code-book known by the
receiver. Note that the adversaries are allowed to access such knowledge. Without common randomness, we prove
that in the presence of adversarial or random errors it is either theoretically impossible or computationally
intractable to estimate topology accurately.

2) Error localization. We provide the first polynomial-time algorithm for locating edges experiencing random
errors. For networks suffering from adversarial errors we provide an upper bound on the number of locatable
errors, and also a corresponding (exponential-time) algorithm that matches this bound. Moreover, we provide
the first proof of computational intractability of the problem. Note that as with error-localization schemes in
the previous literature (lH, [12], [13], [14]), the schemes we provide for RLNC require the information of
network topology and the local linear coding coefficients; this can come from the topology estimation algorithms
in this work, or from the network design a priori.

• In the other direction, to circumvent the provable tomographic limitations of RLNC, we propose a specific
class of random linear network codes that we call network Reed-Solomon coding (NRSC), which has the
following three desirable features:

1) NRSC are linear network codes that are implemented in a distributed manner (each network node only
needs to know the node-IDs of its adjacent neighbors).

2) With high probability over code design, NRSC achieves the multicast capacity.

3) NRSC aids tomography in the following two aspects:

- Computational efficiency. Under the adversarial error model, the receiver can locate a number of adversarial
errors that matches a corresponding tomographic upper bound in a computationally efficient manner. For the
random error model, a lightweight topology estimation algorithm is provided under NRSC.

- Robustness for dynamic networks. For adversarial (and random) error localization the algorithms under
NRSC do not require a priori knowledge of the network topology and thus are robust against dynamic
network updating. For topology estimation in the random error model, the algorithm under NRSC is better
suited to dynamic networks than the one under RLNC.
In Table I we compare our results and previous works on computational complexity.

TABLE I
Comparison of our results and previous works on computational complexity

Objective            | Failure model      | Tomography for RLNC [previous works] | Tomography for RLNC [this work] | Tomography for NRSC [this work]
Topology estimation  | Adversarial errors | -                                    | Exponential                     | -
Topology estimation  | Random errors      | -                                    | Polynomial                      | Polynomial
Failure localization | Adversarial errors | Exponential [8]                      | Hardness proof                  | Polynomial
Failure localization | Random errors      | Exponential [8], [13]                | Polynomial                      | Polynomial


B. Related work 

Common randomness: Essentially all prior tomography results for RLNC assume some form of common randomness,
i.e., the receiver is assumed to have prior knowledge of the random coding coefficients used by internal nodes.
Some previous results Q, lH, [14] for locating errors under RLNC do not explicitly assume common randomness,
but assume the receiver knows all the linear coding coefficients employed by each node in the network, which is
related to our notion of common randomness.

We summarize related work on network inference under the following categories. 

1) Passive tomography: The work in [9] provided the first explicit (exponential-time) algorithm for estimating
the topology of networks performing RLNC with no errors. The work in [8] studied the problem of locating
network errors for RLNC with prior knowledge of network topology. In particular, error localization can be
done in time $O\left(\binom{|\mathcal{E}|}{z}\right)$, where $|\mathcal{E}|$ is the number of links in the network and $z$ is the number of errors the
network experiences.

2) Active tomography: The authors in [12], [13], and [14] perform network tomography by using probe packets
and exploiting the linear algebraic structure of network coding. The setting considered in these works concerns
active tomography, whereas in this work we focus on passive tomography.

a) Random error localization: The authors in [12] and [14] study error^1 localization in a network using
binary XOR coding. Using pre-designed network coding and probe packets, they show that the sources
can use fewer probe packets than traditional tomography schemes based on routing. Again, $O\left(\binom{|\mathcal{E}|}{z}\right)$ is
the computational complexity of localization.
b) Topology estimation: For binary-tree networks using pre-designed binary XOR coding, [13] shows that
the topology can be recovered by using probe packets. The authors in [15] generalize the results to the
multi-source multi-receiver scenario.
3) Network inference with internal nodes' information: Another interesting set of works ([16], [17], [18]) infers
the network from the "packet information" of each internal node. In particular, the work in [17] discusses
the subspace properties of packets received by internal nodes, the work in [16] infers the bottlenecks of
P2P networks using network coding, and the works in [18], [19] provide efficient schemes to locate the
adversaries in the networks. Note that these works require the topology estimator to have access to internal
network nodes.

^1 In fact, network erasures are considered in their works. Here we classify network erasures as a subclass of network errors.



C. Organization of the paper 

The rest of this paper is organized as follows. We formulate the problem in Section II and present preliminaries
in Section III. We then present our main technical results. Our results for network tomography consist of two parts.
Part I considers RLNC: the schemes for topology estimation in the presence of network adversaries and random errors
are presented in Section IV, and the schemes for error localization are presented in Section V. Part II considers a
particular type of RLNC, network Reed-Solomon coding (NRSC), in Sections VI, VII, and VIII.



II. Problem Formulation and Preliminaries 

A. Notational convention 

Scalars are in lower-case (e.g. z). Matrices are in upper-case (e.g. X). Vectors are in lower-case bold-face (e.g. 
e). Column spaces of a matrix are in upper-case bold-face (e.g. E). Sets are in upper-case calligraphy (e.g. Z). 



B. Network setting 

For ease of discussion, we consider a directed acyclic and delay-free network $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ is the set
of nodes and $\mathcal{E}$ is the set of edges. Each node has a unique identification number known to itself. Such a label
could correspond to the node's GPS coordinates, its IP address, or a factory stamp. The capacity of each edge
is normalized to one symbol of $\mathbb{F}_q$ per unit time. We denote by $e(u, v)$ the edge from node $u$ to $v$. For each
node $v \in \mathcal{V}$, let $\mathrm{In}(v)$ be the set of all incoming edges (or nodes) of $v$ and $\mathrm{Out}(v)$ be the set of all outgoing edges
(or nodes) of $v$. The out-degree of node $v$ is defined as $|\mathrm{Out}(v)|$ and the in-degree of node $v$ is defined as $|\mathrm{In}(v)|$.

Note that all the results in the paper can be generalized to the scenario where edges with non-unit capacity
are allowed. A non-unit-capacity edge is modeled as parallel edges, which can be denoted by the somewhat unwieldy
notation $e(u, v, i)$, standing for the $i$-th parallel edge from $u$ to $v$.
We focus on the unicast scenario where a single source s communicates with a single receiver r over the network.
In principle, our results can be generalized to other communication scenarios where RLNC suffices. For instance, 
in the networks with multiple receivers, we assume all incoming edges of the receivers are reconnected to a virtual 
receiver who performs network tomography. 

Let C be the min-cut (or max-flow) from s to r. Without loss of generality, we assume that both the number of 
edges leaving the source s and the number of edges entering the receiver r equal C. We also assume that for every
node in V, there is at least one path between the node and the receiver r; otherwise the node is not involved in
the communication and is irrelevant to our study.

C. Dependency 

Any set of $z$ edges $e_1, e_2, \ldots, e_z$ is said to be flow-independent if there is a path from the tail of each to
the receiver $r$, and these $z$ paths are edge-disjoint. The flow-rank of an edge-set $\mathcal{Z}$ equals the max-flow from
the tails of edges in $\mathcal{Z}$ to the receiver $r$. A collection of edge-sets $\mathcal{Z}_1, \mathcal{Z}_2, \ldots, \mathcal{Z}_n$ is said to be flow-independent if
$\text{flow-rank}(\cup_{i=1}^{n} \mathcal{Z}_i) = \sum_{i=1}^{n} \text{flow-rank}(\mathcal{Z}_i)$. The flow-rank of an internal node equals the flow-rank of its outgoing
edges. For a set $\mathcal{Z} \subseteq \mathcal{E}$ with flow-rank $z$, the extended set $\mathrm{Ext}(\mathcal{Z})$ is the set that is of flow-rank $z$, includes
$\mathcal{Z}$, and is of maximum size. Note that $\mathrm{Ext}(\mathcal{Z})$ is well-defined and unique [20].

D. Network transmission via linear network coding 

In this paper we consider the linear network coding scheme proposed in [21]. Let each packet have $n$ symbols
from $\mathbb{F}_q$, and each edge have the capacity of transmitting one packet, i.e., a row vector in $\mathbb{F}_q^{1 \times n}$.

Source encoder: The source $s$ arranges the data into a $C \times n$ message matrix $X$ over $\mathbb{F}_q$. Then on each outgoing
edge of $s$ a linear combination over $\mathbb{F}_q$ of the rows of $X$ is transmitted. $X$ contains a pre-determined "short" header
(say, the identity matrix in $\mathbb{F}_q^{C \times C}$) known in advance to both the source and the receiver, to indicate the linear
transform from the source to the receiver.

Network encoders: Each internal node similarly takes linear combinations of the packets on incoming edges to
generate the packets transmitted on outgoing edges. Let $\mathbf{x}(e)$ represent the packet traversing edge $e$. An internal
node $v$ generates its outgoing packet $\mathbf{x}(e')$ for edge $e' \in \mathrm{Out}(v)$ as

$\mathbf{x}(e') = \sum_{e \in \mathrm{In}(v)} \beta(e, v, e')\, \mathbf{x}(e)$, (1)

where $\beta(e, v, e')$ is the linear coding coefficient from the packet $\mathbf{x}(e)$ to the packet $\mathbf{x}(e')$ via $v$. As a default let
$\beta(u, v, w) = \beta(e, v, e')$, where $e = (u, v)$ and $e' = (v, w)$.

Receiver decoder: The receiver $r$ constructs the $C \times n$ matrix $Y$ over $\mathbb{F}_q$ by treating the received packets as
consecutive length-$n$ row vectors of $Y$. The network's internal linear operations induce a linear transform between
$X$ and $Y$ as

$Y = TX$, (2)

where $T \in \mathbb{F}_q^{C \times C}$ is the overall transform matrix. The receiver $r$ can extract $T$ from the packet headers (recall that
internal nodes mix headers in the same way as they mix messages). If $T$ is invertible, the receiver can decode
$X$ by $X = T^{-1}Y$.
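As an illustration of the header mechanism, the following minimal sketch builds $X$ with an identity header, applies a transform $T$, and decodes via $X = T^{-1}Y$. It assumes a prime field size (here `P = 257`) so that integers modulo `P` form a field; the matrix `T`, the payload `data`, and the helpers `mat_mul` and `mat_inv` are illustrative choices made here, not values from the paper.

```python
P = 257  # assumed prime field size, so integers mod P form the field F_q


def mat_mul(A, B, p=P):
    """Matrix product over F_p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*B)] for row in A]


def mat_inv(A, p=P):
    """Inverse of a square matrix over F_p by Gauss-Jordan elimination."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [[v % p for v in row] + [int(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        piv = next((r for r in range(c, n) if M[r][c]), None)
        if piv is None:
            raise ValueError("matrix is singular over F_p")
        M[c], M[piv] = M[piv], M[c]
        inv = pow(M[c][c], -1, p)  # modular inverse (Python 3.8+)
        M[c] = [v * inv % p for v in M[c]]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c]
                M[r] = [(vr - f * vc) % p for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


C = 2
data = [[10, 20, 30], [40, 50, 60]]                      # illustrative payload
X = [[int(i == j) for j in range(C)] + data[i] for i in range(C)]  # [I | data]
T = [[3, 2], [1, 4]]                                     # illustrative transform
Y = mat_mul(T, X)                                        # equation (2)

T_hat = [row[:C] for row in Y]                           # header of Y equals T
X_hat = mat_mul(mat_inv(T_hat), Y)                       # X = T^{-1} Y
```

Since the header of $X$ is the identity, the first $C$ columns of $Y$ equal $T$ itself, which is why the receiver needs no side information about $T$ in this sketch.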

E. Network error models 

Networks may experience disruption as a part of normal operation. Edge errors are considered in this work - 
node errors may be modeled as errors of its outgoing edges. 

Let $\mathbf{x}(e) \in \mathbb{F}_q^{1 \times n}$ be the input packet of $e$. For each edge $e \in \mathcal{E}$ a length-$n$ row-vector $\mathbf{z}(e)$ is added to $\mathbf{x}(e)$.
Thus the output packet of $e$ is $\mathbf{y}(e) = \mathbf{x}(e) + \mathbf{z}(e)$. Edge $e$ is said to suffer an error if and only if $\mathbf{z}(e)$ is a non-zero
vector.

Both adversarial and random errors are considered: 

1) Random errors: Every edge $e$ in $\mathcal{E}$ independently experiences random errors with a non-negative probability.
A random error on $e$ means that $\mathbf{z}(e)$ has at least one randomly chosen position, say $i$, such that the $i$-th
symbol of $\mathbf{z}(e)$ is chosen from $\mathbb{F}_q$ uniformly at random.^2

2) Adversarial errors: The network is said to have $z$ adversarial errors if and only if the adversary can arbitrarily
choose a subset of edges $\mathcal{Z} \subseteq \mathcal{E}$ with $|\mathcal{Z}| = z$ and the corresponding erroneous packets $\{\mathbf{z}(e) : e \in \mathcal{Z}\}$. Note
that the adversary is assumed to have unlimited computational capability and has access to the information
of the source matrix $X$, network topology, all network coding coefficients and tomography algorithms used
by the receiver.

F. Tomography Goals 

The focus of this work is passive end-to-end network tomography in the presence of network errors. There are
two tomographic goals:

1) Topology estimation: The receiver $r$ wishes to correctly identify the network topology upstream of it (i.e.,
the graph $\mathcal{G}$).

2) Error localization: The receiver $r$ wishes to identify the locations where errors occur in the network.

Remark: In fact, all tomography schemes in the paper can be generalized in the following manner. Instead of
the incoming edges $\mathrm{In}(r)$ of the receiver $r$, consider any cut $\mathcal{E}_C$ of edges that disconnects source $s$ from receiver
$r$. A network manager that has access to the packets output from $\mathcal{E}_C$ can use the tomography schemes in the paper
to estimate the topology of the upstream network and locate the network errors.

^2 Note the difference between this model and the usual model of dense random errors in [22], wherein $\mathbf{z}(e)$ is chosen from $\mathbb{F}_q^{1 \times n}$ uniformly at random.
The model described in this work is more general in that it can handle such errors as a special case. However, it can also handle what we
call "sparse" errors, wherein only a small fraction of symbols in $\mathbf{z}(e)$ are non-zero. Such a sparse error may be a more natural model of
some transmission error scenarios [23], [24]. They may also be harder to detect. In our model we consider the worst-case sparsity of 1.

G. Network error-correcting codes 

Consider the scenario where a randomly or maliciously faulty set of edges $\mathcal{Z}$ of size $z$ injects faulty packets into
the network. As in [11], the network transform (2) then becomes

$Y = TX + E$. (3)

Note that the $C \times n$ error matrix $E$ has rank at most $z$ (see Section III-C for details). The goal for the receiver
$r$ in the presence of such errors is still to reconstruct the source's message $X$. Note that the loss-rate $2z/C$ is
necessary and sufficient for correcting $z$ adversarial errors [11], [10], while the loss-rate $(z + 1)/C$ is necessary
and sufficient [10] for correcting $z$ random errors.

In this work we use the algorithms of [11] for adversarial errors, and the algorithms of [10] for random errors.
All our tomography schemes are based on the correct use of these network error-correcting codes.

H. Computational hardness of NCPRLC 

Several theorems we prove regarding the computational intractability of some tomographic problems utilize the 
hardness results of the following well-studied problem. 

The Nearest Codeword Problem for Random Linear Codes (NCPRLC) is defined as follows: 

• NCPRLC: $(H, z, \mathbf{e})$: Given a parity check matrix $H$ chosen uniformly at random over $\mathbb{F}_q^{l_1 \times l_2}$ with
$l_2 > l_1$, a constant $z$, and a vector $\mathbf{e}$ in the column space of $H$ that is a linear combination of at most $z$ columns of $H$, the
algorithm is required to output a $z$-sparse solution $\mathbf{b}$ for $H\mathbf{b} = \mathbf{e}$, i.e., $\mathbf{e} = H\mathbf{b}$ and $\mathbf{b}$ has at most $z$ nonzero
components.

Note that NCPRLC is a well-known computationally hard problem [25], [26].

I. Decoding of Reed-Solomon codes

This section introduces some properties of the well-studied Reed-Solomon codes (RSCs) [27], used in particular
for worst-case error-correction for point-to-point channels. A Reed-Solomon code (RSC) is a linear error-correcting
code over a finite field $\mathbb{F}_q$ defined by its parity check matrix $H \in \mathbb{F}_q^{l_1 \times l_2}$ with $l_2 > l_1$. Here $l_1 + 1$ is the minimum
Hamming distance of the code, i.e., the minimum number of nonzero components among the nonzero codewords belonging to
the code. In particular, $H$ is formed as

$H = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_{l_2}]$, (4)

where $\mathbf{h}_i = [h_i, (h_i)^2, \ldots, (h_i)^{l_1}]^T \in \mathbb{F}_q^{l_1}$ and $h_i \neq 0$ for each $i \in [1, l_2]$ and $h_i \neq h_j$ for $i \neq j$.

Given $\mathbf{e}$ which is a linear combination of any $z < (l_1 + 1)/2$ columns of $H$, the decoding algorithm of RS-CODE,
denoted as RS-DECODE$(H, \mathbf{e})$, outputs a $z$-sparse solution of $H\mathbf{b} = \mathbf{e}$ with $O(l_2 l_1)$ operations over $\mathbb{F}_q$ (see [28]).
That is, $\mathbf{b} \in \mathbb{F}_q^{l_2}$ has at most $z$ non-zero components and $\mathbf{e} = H\mathbf{b}$. Furthermore, for any $\mathbf{b}' \neq \mathbf{b}$ either $\mathbf{e} \neq H\mathbf{b}'$
or $\mathbf{b}'$ has more than $z$ non-zero components, i.e., $\mathbf{b}$ is the unique $z$-sparse solution of $H\mathbf{b} = \mathbf{e}$.
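The uniqueness property can be checked numerically on tiny parameters. The sketch below builds a parity check matrix of the form (4) and finds all $z$-sparse solutions by exhaustive search. It is not RS-DECODE: the exhaustive search is exponential in $z$ and is used only to illustrate uniqueness. The field size `p = 11`, the evaluation points, and the helper names are assumptions made for this illustration.

```python
from itertools import combinations, product


def rs_parity_check(points, l1, p):
    """l1 x l2 matrix whose i-th column is [h_i, h_i^2, ..., h_i^l1]^T over F_p."""
    return [[pow(h, k, p) for h in points] for k in range(1, l1 + 1)]


def syndrome(H, b, p):
    """Compute H b over F_p."""
    return [sum(hv * bv for hv, bv in zip(row, b)) % p for row in H]


def sparse_solutions(H, e, z, p):
    """All vectors b with at most z nonzero entries such that H b = e (brute force)."""
    l2 = len(H[0])
    sols = []
    for k in range(z + 1):
        for support in combinations(range(l2), k):
            for vals in product(range(1, p), repeat=k):
                b = [0] * l2
                for i, v in zip(support, vals):
                    b[i] = v
                if syndrome(H, b, p) == e:
                    sols.append(b)
    return sols


p, l1 = 11, 4                      # assumed toy parameters (p prime)
points = [1, 2, 3, 4, 5, 6, 7, 8]  # distinct nonzero h_i, so l2 = 8 > l1
H = rs_parity_check(points, l1, p)

z = 2                              # satisfies z < (l1 + 1) / 2
b_true = [0, 3, 0, 0, 0, 0, 7, 0]  # a 2-sparse vector
e = syndrome(H, b_true, p)
solutions = sparse_solutions(H, e, z, p)
```

In this instance any $l_1 = 4$ columns of $H$ are linearly independent, so two distinct 2-sparse solutions cannot exist, and the search returns only `b_true`.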

III. Impulse Response Vector (IRV) 

In this section, we explain the relationship between the linear transforms induced by the linear network coding 
and the network structures, by introducing the concept of impulse response vector (IRV). The relationship forms 
the mathematical basis for our proposed tomography schemes. 

A. Definition of Impulse Response Vector (IRV) 

Corresponding to each edge $e \in \mathcal{E}$ we define the length-$C$ impulse response vector (IRV) $\mathbf{t}'(e) \in \mathbb{F}_q^{C}$ as the
linear transform from $e$ to the receiver. In particular, let the source $s$ transmit the all-zeroes packet $\mathbf{0} \in \mathbb{F}_q^{1 \times n}$ on all
outgoing edges, let edge $e$ inject a packet $\mathbf{z}(e) \in \mathbb{F}_q^{1 \times n}$, and let each internal node perform the linear network coding
operation. Then the matrix $Y$ received by the receiver $r$ is $Y = \mathbf{t}'(e)\mathbf{z}(e) \in \mathbb{F}_q^{C \times n}$. So $\mathbf{t}'(e)$ can be thought of as
a "unit impulse response" from $e$ to $r$.

An illustrative example for edge IRVs is in Figure 2 and Figure 3, where the coding coefficients are shown in
Figure 2 and the packet length is assumed to be 1. In Figure 3(a), only $e_4$ has an injected symbol 1 and what $r$
receives is $\mathbf{y} = [1, 0]^T$, thus the IRV of $e_4$ is $\mathbf{t}'(e_4) = [1, 0]^T$. For the same reason, the IRVs of $e_5$, $e_3$, $e_2$ and $e_1$
are computed in Figure 3(b), Figure 3(c), Figure 3(d) and Figure 3(e) respectively.

For a set of edges $\mathcal{Z} \subseteq \mathcal{E}$ with $|\mathcal{Z}| = z$, the columns of the $C \times z$ impulse response matrix $T'(\mathcal{Z})$ consist of
the set of vectors $\{\mathbf{t}'(e) : e \in \mathcal{Z}\}$.

All IRVs can be computed inductively. First, the IRV for each edge incoming to the receiver is set as a distinct
unit vector, i.e., a distinct column of the $C \times C$ identity matrix. Then for each edge $e$ incoming to node $v$ with
outgoing edges $\{e_1, e_2, \ldots, e_d\}$ we have

$\mathbf{t}'(e) = \sum_{j=1,2,\ldots,d} \beta(e, v, e_j)\, \mathbf{t}'(e_j)$.
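The inductive rule can be sketched as a short recursion from the receiver back towards the source. The topology and coefficients below are an assumption: they are one choice consistent with the IRV values quoted in the caption of Fig. 3, since Figure 2 itself is not reproduced here. The field size `p = 7` and the names `compute_irvs` and `downstream` are likewise hypothetical.

```python
def compute_irvs(receiver_edges, downstream, p):
    """Compute the IRV of every edge over F_p.

    receiver_edges: ordered list of the C edges entering the receiver r.
    downstream[e]:  {e_j: beta(e, v, e_j)} for each outgoing edge e_j of the
                    node v that edge e enters (only for edges not entering r).
    """
    C = len(receiver_edges)
    # Base case: edges entering r get distinct columns of the C x C identity matrix.
    irv = {e: tuple(int(i == k) for i in range(C)) for k, e in enumerate(receiver_edges)}

    def visit(e):
        if e not in irv:
            acc = [0] * C
            for e_j, beta in downstream[e].items():
                t_j = visit(e_j)
                acc = [(a + beta * x) % p for a, x in zip(acc, t_j)]
            irv[e] = tuple(acc)
        return irv[e]

    for e in downstream:
        visit(e)
    return irv


# Assumed structure: e1 feeds e4 (coefficient 3) and e3 (coefficient 2);
# e3 feeds e5 (coefficient 1); e2 feeds e5 (coefficient 2).
downstream = {
    "e1": {"e4": 3, "e3": 2},
    "e3": {"e5": 1},
    "e2": {"e5": 2},
}
irvs = compute_irvs(["e4", "e5"], downstream, p=7)
print(irvs["e1"], irvs["e2"], irvs["e3"])  # (3, 2) (0, 2) (0, 1)
```

The recursion terminates because the network is acyclic; memoizing in `irv` means each edge is processed once.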



B. IRVs under random linear network coding (RLNC) 

The linear network coding defined in Section II-D is a random linear network coding (RLNC) if and only if [6]:
(a) $\mathbf{t}'(e_4)$ (b) $\mathbf{t}'(e_5)$ (c) $\mathbf{t}'(e_3)$ (d) $\mathbf{t}'(e_2)$ (e) $\mathbf{t}'(e_1)$

Fig. 3. The IRVs of the edges shown in Figure 2: $\mathbf{t}'(e_4) = [1, 0]^T$, $\mathbf{t}'(e_5) = [0, 1]^T$, $\mathbf{t}'(e_3) = \mathbf{t}'(e_5) = [0, 1]^T$, $\mathbf{t}'(e_2) = 2\mathbf{t}'(e_5) = [0, 2]^T$
and $\mathbf{t}'(e_1) = 3\mathbf{t}'(e_4) + 2\mathbf{t}'(e_3) = [3, 2]^T$. Edges $e_2$ and $e_3$ are not flow-independent, so the IRV $\mathbf{t}'(e_2)$ equals $\mathbf{t}'(e_3)$ (up to a scalar
multiple). Conversely, $e_1$ and $e_5$ are flow-independent, so $\mathbf{t}'(e_1)$ is linearly independent of $\mathbf{t}'(e_5)$.



Source encoder: The source s takes C independently and uniformly random linear combinations of the rows of 
X to generate respectively the packets transmitted on each edge outgoing from s (recall that exactly C edges leave 
the source s). 

Network encoders: Each internal node, say $v$, independently and uniformly chooses its local coding coefficients
$\{\beta(e, v, e') : e \in \mathrm{In}(v), e' \in \mathrm{Out}(v)\}$ at random.

Receiver decoder: As shown in Equation (2), the receiver $r$ receives $Y$ as $Y = TX$, where $T$ is the overall
transform matrix. It is proved that with probability at least $1 - |\mathcal{E}|/q$ the matrix $T$ is invertible for RLNC.
The receiver extracts $T$ from the header of $Y$ and decodes $X$ by $X = T^{-1}Y$.



For RLNC, the linear transforms defined in Section III-A provide an algebraic interpretation of the graph.
To be concrete, Lemma 1 below states that the linear independence of the IRVs has a close relationship with the
flow-independence of the edges. The relationship is used in tomography schemes shown in later sections.

Lemma 1: 1) The rank of the impulse response matrix $T'(\mathcal{Z})$ of an edge set $\mathcal{Z}$ with flow-rank $z$ is at most
$z$.

2) The IRVs of flow-independent edges are linearly independent with probability at least $1 - |\mathcal{E}|/q$.
Proof: 
1) When the flow-rank of $\mathcal{Z}$ is $z$, the max-flow from $\mathcal{Z}$ to $r$ is at most $z$. If the rank of $T'(\mathcal{Z})$ were larger than
$z$, say $z + 1$, $\mathcal{Z}$ could transmit information to $r$ at rate $z + 1$, which is a contradiction.

2) For a flow-independent edge set $\mathcal{Z}$ with cardinality $z$, assume a virtual source node $s'$ has $z$ virtual edges
connected to the headers of $\mathcal{Z}$, and all outgoing edges (except for $\mathcal{Z}$) of the headers of $\mathcal{Z}$ are deleted. The
max-flow from $s'$ to $r$ is $z$ and $\mathcal{Z}$ is a cut. Then $T'(\mathcal{Z})$ has rank $z$ if and only if $s'$ can transmit information
to $r$ at rate $z$. By a direct corollary of Theorem 1 in [5], this happens with probability at least $1 - |\mathcal{E}|/q$.

□

Thus for a large enough field-size $q$, properties of the edge sets map to similar properties of the IRVs.
For instance, with probability at least $1 - |\mathcal{E}|/q$, $\text{flow-rank}(\cup_{i=1}^{n} \mathcal{Z}_i) = \sum_{i=1}^{n} \text{flow-rank}(\mathcal{Z}_i)$ if and only if
$\mathrm{rank}(T'(\cup_{i=1}^{n} \mathcal{Z}_i)) = \sum_{i=1}^{n} \mathrm{rank}(T'(\mathcal{Z}_i))$. Thus by studying the ranks of $T'(\mathcal{Z}_i)$, we can infer the flow-rank
structures of $\mathcal{Z}_i$.

The example in Figure 3 also shows the relationship between flow-independence and linear independence.

C. IRVs for network errors 

Assume a faulty set of edges $\mathcal{Z}$ of size $z$ injects faulty packets into the network, i.e., $\mathcal{Z} = \{e : e \in \mathcal{E}, \mathbf{z}(e) \neq \mathbf{0}\}$
and $|\mathcal{Z}| = z$. From the definition of IRV, we have:

$Y = TX + T'(\mathcal{Z})Z$, (5)

where $Z$ is a $z \times n$ matrix whose rows are the erroneous packets $\{\mathbf{z}(e) : e \in \mathcal{Z}\}$. Thus the error matrix $E$
defined in Equation (3) (of Section II-G) equals $T'(\mathcal{Z})Z$ and has rank at most $z$.
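Equation (5) suggests a direct, if inefficient, way to use IRVs as fingerprints: an edge set $\mathcal{Z}$ can explain an error matrix $E$ only if every column of $E$ lies in the span of the IRVs of $\mathcal{Z}$. The sketch below tests all edge sets of size $z$ by brute force, which is exponential in $z$ and is not the polynomial-time algorithm of this paper. The IRV values are those quoted in the caption of Fig. 3; the field size `p = 7`, the injected packet, and the helper names are assumptions.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix (given as a list of rows) over F_p, p prime."""
    M = [[v % p for v in r] for r in rows]
    rank = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        piv = next((r for r in range(rank, len(M)) if M[r][c]), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        inv = pow(M[rank][c], -1, p)  # modular inverse (Python 3.8+)
        M[rank] = [v * inv % p for v in M[rank]]
        for r in range(len(M)):
            if r != rank and M[r][c]:
                f = M[r][c]
                M[r] = [(a - f * b) % p for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank


def consistent_edge_sets(irv, E, z, p):
    """Edge sets of size z whose IRVs span every column of the error matrix E."""
    cols_E = [list(col) for col in zip(*E)]
    result = []
    for Z in combinations(sorted(irv), z):
        basis = [list(irv[e]) for e in Z]
        # Adding the columns of E must not increase the rank of the IRV set.
        if rank_mod_p(basis + cols_E, p) == rank_mod_p(basis, p):
            result.append(Z)
    return result


p = 7
irv = {"e1": (3, 2), "e2": (0, 2), "e3": (0, 1), "e4": (1, 0), "e5": (0, 1)}

z_e1 = [4, 5]                                         # packet injected on e1 (n = 2)
E = [[t * s % p for s in z_e1] for t in irv["e1"]]    # E = t'(e1) z(e1)
print(consistent_edge_sets(irv, E, z=1, p=p))         # [('e1',)]
```

Edges whose IRVs are scalar multiples of each other, such as $e_2$, $e_3$ and $e_5$ here, cannot be told apart by this test, which mirrors the role of flow-independence discussed above.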



Part I: Network Tomography for Random Linear Network Coding (RLNC) 
IV. Topology estimation for RLNC 

A. Common randomness 

Common randomness means that all candidate local coding coefficients $\{\beta(u, v, w) : u, w \in \mathcal{V}\}$ of node $v \in \mathcal{V}$
are chosen from its local random code-book $\mathcal{R}_v$, and the set of all local random code-books $\mathcal{R} = \{\mathcal{R}_v : v \in \mathcal{V}\}$ is
known a priori to the receiver $r$. Note that $\mathcal{R}$ can be public to all parties including the adversaries.

Common randomness is both necessary and sufficient for network topology estimation under RLNC. On one
hand, sufficiency follows from the work in [9] and this section. On the other hand, we show later in Theorem 2
and Theorem 3 that in the presence of adversarial (or random) errors, determining the topology without assuming
common randomness is theoretically impossible (or computationally intractable).

Each local random code-book in $\mathcal{R}$ consists of a list of elements from $\mathbb{F}_q$, with each element chosen independently
and uniformly at random. These random code-books can be securely broadcast by the source a priori
using a common public key signature scheme such as RSA [29], or as part of network design.

Depending on the types of failures in the network, we define two types of common randomness. Recall that
$\beta(u, v, w)$ is the local coding coefficient from edge $e(u, v)$ via $v$ to the edge $e'(v, w)$ (see Section II-D for details).
1) Weak type common randomness for random errors: For node $v \in \mathcal{V}$ each distinct element $(u, w)$ in $\mathcal{V} \times \mathcal{V}$
indexes a distinct element in $\mathcal{R}_v$. The local coding coefficient $\beta(u, v, w)$ is chosen as the element $\mathcal{R}_v(u, w)$.
For instance consider the subnetwork shown in Figure 4. Under weak type common randomness, Figure 5
shows how node $v_1$ chooses the coding coefficient $\beta(v_2, v_1, v_4)$.




Fig. 4. The adjacent neighbors of node $v_1$.

Fig. 5. Under weak type common randomness, node $v_1$ in Figure 4 chooses $\beta(v_2,v_1,v_4)$ as the element shown in the dark region.

2) Strong type common randomness for adversarial errors: For node $v$, each distinct element $(u,w,w')$ in $\mathcal{V} \otimes \mathcal{V} \otimes \mathcal{V}$ indexes a distinct element in $\mathcal{R}_v$. For a given network, recall that $\mathrm{Out}(v)$ is the set of outgoing edges of $v$. The coding coefficient $\beta(u,v,w)$ is chosen as

$$\beta(u,v,w) = \sum_{e(v,w') \in \mathrm{Out}(v)} \mathcal{R}_v(u,w,w'). \qquad (6)$$

For instance, consider the subnetwork shown in Figure 4. Under strong type common randomness, Figure 6 shows how node $v_1$ chooses the coding coefficient $\beta(v_2,v_1,v_4)$.

Fig. 6. Under strong type common randomness, node $v_1$ in Figure 4 chooses $\beta(v_2,v_1,v_4)$ as the sum of three elements of its code-book.



Remark 1: For strong type common randomness, since 1) all symbols in $\mathcal{R}_v$ are chosen independently and uniformly over the finite field $\mathbb{F}_q$ and 2) for different coding coefficients $\beta(u,v,w)$ the summation in Equation (6) involves distinct elements of $\mathcal{R}_v$, the coding coefficients $\beta(u,v,w)$ chosen by Equation (6) are also independently and uniformly distributed over $\mathbb{F}_q$.
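The two look-up rules can be illustrated with a minimal sketch. It assumes a prime field so that arithmetic reduces to integers modulo `Q`, uses a seeded generator in place of the a priori broadcast of the code-books, and assumes for illustration that $v_1$ has upstream neighbor $v_2$ and downstream neighbors $v_4, v_5, v_6$. The names `make_codebook`, `beta_weak`, and `beta_strong` are hypothetical helpers, not notation from the paper.

```python
import itertools
import random

Q = 257  # hypothetical prime field size, so arithmetic is plain "mod Q"


def make_codebook(nodes, arity, rng):
    """Code-book of one node: one uniform element of F_Q per index tuple."""
    return {idx: rng.randrange(Q) for idx in itertools.product(nodes, repeat=arity)}


def beta_weak(book_v, u, w):
    """Weak type: beta(u, v, w) is the single entry R_v(u, w)."""
    return book_v[(u, w)]


def beta_strong(book_v, u, w, out_neighbors):
    """Strong type: sum R_v(u, w, w') over all out-neighbors w' of v."""
    return sum(book_v[(u, w, w2)] for w2 in out_neighbors) % Q


nodes = ["v1", "v2", "v3", "v4", "v5", "v6"]
rng = random.Random(0)  # a shared seed stands in for the a priori broadcast
weak_book = make_codebook(nodes, 2, rng)    # code-book of v1, weak type
strong_book = make_codebook(nodes, 3, rng)  # code-book of v1, strong type

# Assumed neighborhood of v1: v2 upstream; v4, v5, v6 downstream.
b_weak = beta_weak(weak_book, "v2", "v4")
b_strong = beta_strong(strong_book, "v2", "v4", ["v4", "v5", "v6"])
```

Under the strong rule, removing any outgoing edge of $v_1$ changes every coefficient that $v_1$ uses, which is the property that Remark 2 relies on.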

Remark 2: For adversarial errors, the existence of an edge $e(v,w)$ must affect the coding coefficients $\{\beta(u,v,w') : w' \neq w\}$. Otherwise, if the adversary corrupts $e(v,w)$ and sends only all-zero packets on it, the receiver cannot notice the existence of $e(v,w)$. This is why a different type of common randomness is used for networks suffering adversarial errors.

Remark 3: Under common randomness, all local coding coefficients are known once the network topology is known. Thus the IRVs of the edges can be computed efficiently.

Remark 4: For networks with parallel edges, the random code-book $\mathcal{R}_v$ requires somewhat unwieldy notation. For instance, under weak type common randomness the element $\mathcal{R}_v(u,w,i,j)$ gives the coding coefficient from edge $(u,v,i)$ (i.e., the $i$th parallel edge between $u$ and $v$) to edge $(v,w,j)$ via $v$.

We first prove the necessity of common randomness for topology estimation in networks with adversarial errors. Since the adversaries can hide themselves and inject only zero errors, it suffices to prove that common randomness is necessary for topology estimation in networks with zero errors.

Theorem 2: If internal nodes choose local coding coefficients independently and randomly without assuming common randomness, there exist two networks that cannot be distinguished by the receiver in the absence of network errors.

Proof: Since the overall transform matrix (see Equation (2) for details) is the only information the receiver can retrieve from the received packets, it suffices to prove that the overall transform matrices of Exp1 and Exp2 in Figure 7 are statistically indistinguishable.

For Exp1, let $T_s(1)$ be the transform matrix from $s$ to $u_1$. Similarly, let $T_{u_1}$, $T_{u_2}$, and $T_{u_3}$ be the transform matrices from $u_1$, $u_2$, and $u_3$ to their adjacent downstream nodes, respectively. Thus the transform matrix $T(1)$ from $s$ to $r$ in Exp1 is $T(1) = T_{u_3}T_{u_2}T_{u_1}T_s(1)$.

For the same reason, the transform matrix $T(2)$ from $s$ to $r$ in Exp2 is $T(2) = T'_{u_3}T'_{u_2}T'_{u_1}T_s(2)$, where the primed matrices are the corresponding transform matrices in Exp2.

Since each element of $T_{u_3}$, $T_{u_2}$, $T_{u_1}$, $T_s(1)$, $T'_{u_3}$, $T'_{u_2}$, $T'_{u_1}$, and $T_s(2)$ is chosen independently and uniformly at random, $T(1)$ is statistically indistinguishable from $T(2)$. □




Fig. 7. Two networks, Exp1 and Exp2, that are impossible for the receiver to distinguish.



For the random error model (see Section II-E for details), Theorem 2 does not suffice to show the necessity of common randomness. The reason is that in a zero-error network the only network information observed by the receiver is the transform matrix $T$, whereas in a network suffering random errors, a random error on an edge may expose the IRV of that edge, which aids topology estimation. In the following we prove that, without common randomness, topology estimation is at least as computationally intractable as NCPRLC (see the definition in Section II-H for details).

For the random error model, as in Equation (5), the receiver gets $Y = TX + E$, where $E = T'(\mathcal{Z})Z$. Thus $E$ and $T$ are all the information observed by the receiver $r$. Let $\mathcal{I}_{IRV}$ be the set of vectors, each of which equals the IRV of an edge in the network. Note that $\mathcal{I}_{IRV}$ is merely a set of vectors, and as such, an individual element carries no correspondence with any edge in the network. When the edges suffer random errors independently, the entries of $Z$ are chosen at random. Thus the error matrix $E = T'(\mathcal{Z})Z$ cannot provide more information than $T'(\mathcal{Z})$, whose columns are in $\mathcal{I}_{IRV}$. Thus it suffices to prove:

Theorem 3: When the internal nodes choose local coding coefficients independently and randomly without assuming common randomness, if the receiver $r$ can correctly estimate the topology in time polynomial in the network parameters when given $T$ and $\mathcal{I}_{IRV}$ a priori, then NCPRLC can be solved in time polynomial in the problem parameters.

Proof: Given an NCPRLC instance $(H, z, \mathbf{e})$, as shown in Figure 8, we construct a network with $l_1$ edges to $r$ and $l_2$ edges to node $u$.

Fig. 8. A network reduced from the NCPRLC instance $(H, z, \mathbf{e})$.

Since $H$ is a matrix chosen uniformly at random over $\mathbb{F}_q$, it corresponds to an RLNC, where each column of $H$ corresponds to the IRV of an edge in $\mathrm{In}(u)$.

Let $e$ be an edge whose tail is connected to $z$ edges in $\mathrm{In}(u)$, and let the IRV of $e$ be $\mathbf{e}$. If the receiver $r$ can recover the topology, then $r$ can tell how the tail of $e$ is connected to the $z$ edges in $\mathrm{In}(u)$. Thus $r$ can find a linear combination of $z$ columns of $H$ that results in $\mathbf{e}$, and thereby solve $(H, z, \mathbf{e})$. □

B. Topology estimation for networks with adversarial errors 

In this section, we use an error-correcting code approach [11] to estimate the topology of a network with adversarial errors. At a high level, the idea is that in strongly connected networks, each pair of networks generates transform matrices that look "very different". Hence, no matter what the adversary does, the adversary is unable to make the transform matrix for one network resemble that of any other. The estimation algorithm and proof techniques are similar in flavor to those from algebraic coding theory.

As is common in the network error-correction literature, we assume that the adversary is bounded, and therefore corrupts no more than $z$ edges in the network.
Assumptions and justifications:

1) At most $z$ edges in $\mathcal{Z}$ suffer errors, i.e., $\{e : e \in \mathcal{E}, z(e) \neq 0\} = \mathcal{Z}$ and $|\mathcal{Z}| \le z$. When $2z + 1 \le C$, network error-correcting codes (see Section II-G for details) are used so that the source message $X$ is provably decodable.

2) Strong connectivity. A set of networks satisfies "strong connectivity" if each internal node has both in-degree and out-degree at least $2z + 1$. Note that in an acyclic graph this implies that the source has at least $2z + 1$ edge-disjoint paths to each internal node, which in turn has $2z + 1$ edge-disjoint paths to the receiver. We motivate this strong connectivity requirement by showing in Theorem 6 lower bounds on the connectivity required for any topology estimation scheme to work in the presence of an adversary. Note that for a single-source (or single-receiver) network, such connectivity requires parallel edges at the source (or the receiver). If parallel edges are not allowed, we assume instead that the neighbors of the source (or the receiver) are the end-nodes, i.e., they are not in the domain of tomography. A similar argument holds in later sections.

3) Knowledge of local topology. We assume that each node knows the IDs of the nodes exactly one hop away from it, either upstream or downstream.

4) Strong type common randomness is assumed. It is justified by Theorem 2.

After receiving the overall transform matrix $T_e$, which is polluted by the adversarial errors, the receiver $r$ uses the following algorithm to estimate the topology of the network.

• ALGORITHM I, TOPO-ADV-RLNC: Under RLNC, the algorithm uses the end-to-end information observed by the receiver to estimate the network topology in the presence of network adversaries.

• The inputs are $T_e$ and $\mathcal{R} = \{\mathcal{R}_v : v \in \mathcal{V}\}$. The output is a graph $\bar{\mathcal{G}}$.

• Step A: For each candidate graph $\bar{\mathcal{G}}$ with nodes in $\mathcal{V}$ that satisfies the strong connectivity requirement (see Assumption 2), go to Step B.

• Step B: Using $\mathcal{R}$, receiver $r$ computes the overall transform matrix $T(\bar{\mathcal{G}})$ for $\bar{\mathcal{G}}$. If $\mathrm{rank}(T(\bar{\mathcal{G}}) - T_e) \le z$, output $\bar{\mathcal{G}}$ and go to Step C; otherwise, return to the loop in Step A.

• Step C: End TOPO-ADV-RLNC.
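The loop of Steps A and B can be sketched as follows. This is only an illustration under simplifying assumptions: the field is a prime field (so arithmetic is integer arithmetic modulo `p`), and the transform matrix of every candidate graph is assumed to have been computed from the code-books beforehand and is passed in as a `(label, matrix)` pair. The helper names `rank_mod_p`, `rank_distance`, and `topo_adv_rlnc` are hypothetical.

```python
def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field GF(p), by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def rank_distance(a, b, p):
    """Rank distance between two equally sized matrices: rank(a - b) over GF(p)."""
    diff = [[(x - y) % p for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return rank_mod_p(diff, p)


def topo_adv_rlnc(t_e, candidates, z, p):
    """Return the label of the first candidate within rank distance z of t_e, else None."""
    for label, t_candidate in candidates:
        if rank_distance(t_candidate, t_e, p) <= z:
            return label
    return None


P = 7
t_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # transform matrix of the true network
t_other = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]  # transform matrix of a different candidate
t_e = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]      # t_true plus a rank-one error
estimate = topo_adv_rlnc(t_e, [("G_other", t_other), ("G_true", t_true)], z=1, p=P)
```

The sketch enumerates whatever candidates it is given; enumerating all graphs that satisfy the strong connectivity requirement is exponential in the number of nodes.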

Before proving the correctness of TOPO-ADV-RLNC, we show the key lemma on the rank distance between different graphs. The rank distance between any two matrices $A, B \in \mathbb{F}_q^{C \times C}$ is defined as $r_m(A,B) = \mathrm{rank}(A - B)$. We note that the rank distance indeed satisfies the properties of a distance function; in particular, it satisfies the triangle inequality [11].

Lemma 4: Let the transform matrices of two different networks $\mathcal{G}$ and $\mathcal{G}'$ be $T(\mathcal{G})$ and $T(\mathcal{G}')$ respectively. Then with probability at least $1 - |\mathcal{V}|^4/q$, $r_m(T(\mathcal{G}), T(\mathcal{G}')) \ge 2z + 1$.

Proof: Since $\mathcal{G} \neq \mathcal{G}'$, there exists a node $u \neq r$ in $\mathcal{G}$ which is either not in $\mathcal{G}'$ or has an outgoing edge $e_u$ in $\mathcal{G}$ that is not in $\mathcal{G}'$ (otherwise we can switch the roles of $\mathcal{G}$ and $\mathcal{G}'$ in the proof).

We first show that there exists a $(2z + 1) \times (2z + 1)$ sub-matrix of $T(\mathcal{G}) - T(\mathcal{G}')$ whose determinant is nonzero under some evaluation of the elements of the code-books in $\mathcal{R}$. By Assumption 2, in $\mathcal{G}$ there exist $2z + 1$ edge-disjoint paths from $s$ to $r$ via $u$. The elements in $\mathcal{R}$ can be evaluated such that i) only the routing transmissions along these paths are allowed; ii) in $\mathcal{G}$, the source $s$ can transmit $2z + 1$ packets to $r$ by routing via $u$; iii) the elements in $\mathcal{R}_u$ satisfy $\mathcal{R}_u(v,w,w') = 0$ if $(u,w') \neq e_u$.

Thus for graph $\mathcal{G}$, under such an evaluation of $\mathcal{R}$, the transform matrix $T(\mathcal{G})$ has a $(2z + 1) \times (2z + 1)$ identity matrix as a sub-matrix.

For the case where $u \notin \mathcal{G}'$, since the receiver $r$ can only receive the routing transmissions via $u$, the transform matrix $T(\mathcal{G}')$ is a zero matrix.

For the case where $u \in \mathcal{G}'$, since $e_u \notin \mathcal{G}'$, in $\mathcal{G}'$ all local coding coefficients used by $u$ are zero, and therefore the transform matrix $T(\mathcal{G}')$ is still a zero matrix.

Thus under such an evaluation of $\mathcal{R}$, $T(\mathcal{G}) - T(\mathcal{G}')$ always has a $(2z + 1) \times (2z + 1)$ sub-matrix with determinant 1.

The determinant of the sub-matrix is a polynomial in the random variables belonging to $\mathcal{R}$, of degree at most $|\mathcal{E}| \times (2z + 1) \le |\mathcal{E}|^2 \le |\mathcal{V}|^4$. By the Schwartz-Zippel Lemma [30], with probability at least $1 - |\mathcal{V}|^4/q$ the determinant of the sub-matrix is nonzero, i.e., $r_m(T(\mathcal{G}), T(\mathcal{G}')) \ge 2z + 1$. □

Since there are at most $2^{|\mathcal{V}|^2/2}$ acyclic graphs, and hence at most $2^{|\mathcal{V}|^2}$ pairs of graphs to be compared, the Union Bound [31] implies that with probability at least $1 - |\mathcal{V}|^4 2^{|\mathcal{V}|^2}/q$ the lemma holds for every pair of networks.

As in Equation (5), after transmission, the erroneous transfer matrix $T_e$ received by $r$ is actually

$$T_e = T + T'(\mathcal{Z})Z_h, \qquad (7)$$

where $Z_h$ represents the errors injected into the packet headers, i.e., the first $C$ columns of $Z$. This combined with Lemma 4 enables us to prove the correctness of TOPO-ADV-RLNC.

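Theorem 5: With probability at least $1 - |\mathcal{V}|^4 2^{|\mathcal{V}|^2}/q$, the network $\bar{\mathcal{G}}$ output by TOPO-ADV-RLNC is the correct network.

Proof: We assume that Lemma 4 holds for every pair of graphs, which happens with probability at least $1 - |\mathcal{V}|^4 2^{|\mathcal{V}|^2}/q$ as stated above.

By the expression for $T_e$ above, the rank distance $r_m(T_e, T(\mathcal{G}))$ equals $\mathrm{rank}(T'(\mathcal{Z})Z_h) \le \mathrm{rank}(T'(\mathcal{Z})) \le z$. For any transfer matrix $T(\mathcal{G}')$ corresponding to a different network $\mathcal{G}'$, by the triangle inequality of the rank distance, $r_m(T(\mathcal{G}'), T_e) \ge r_m(T(\mathcal{G}'), T(\mathcal{G})) - r_m(T(\mathcal{G}), T_e) \ge (2z + 1) - z = z + 1$. Hence the true network is the only candidate that passes the test in Step B. This completes the proof. □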
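To get a feel for the field size that this bound demands, one can solve $|\mathcal{V}|^4 2^{|\mathcal{V}|^2}/q \le \delta$ for $\log_2 q$, which gives $\log_2 q \ge |\mathcal{V}|^2 + 4\log_2|\mathcal{V}| + \log_2(1/\delta)$. The sketch below evaluates this; the target failure probability `delta` and the function name `field_bits_needed` are illustrative assumptions, not quantities defined in the paper.

```python
import math


def field_bits_needed(num_nodes, delta):
    """Smallest integer b such that |V|^4 * 2^(|V|^2) / 2^b <= delta."""
    return math.ceil(num_nodes ** 2 + 4 * math.log2(num_nodes) + math.log2(1 / delta))


# Example: 4 nodes and a target failure probability of 2^-10.
bits = field_bits_needed(4, 2 ** -10)
```

The dominant term is $|\mathcal{V}|^2$, so the number of bits per field symbol grows quadratically with the number of nodes.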

Finally, we show that the strong connectivity requirement (see Assumption 2) needed for Theorem 5 is "almost" tight.

Theorem 6: For any network $\mathcal{G}$ that has fewer than $z + 1$ edges from the source $s$ to each node, or fewer than $2z + 1$ edges from each node to the receiver $r$, there exists an adversarial action that makes any tomographic scheme fail to estimate the network topology.

Proof: Assume node $v$ has a min-cut of $2z$ to the receiver $r$, and the adversary controls a set $\mathcal{Z}$ of $z$ of these edges. When the adversary runs a fake version of the tomographic protocol announcing that $v$ is not connected to the edges in $\mathcal{Z}$, the probability that $r$ incorrectly infers the presence of $v$ is $1/2$.

On the other hand, if $v$ has only $z$ incoming edges, the adversary can cut these off (i.e., simulate erasures on these edges). Since the node can only transmit messages derived from its incoming edges, this implies that all messages outgoing from $v$ are also, essentially, erased. Hence the presence of $v$ cannot be detected by $r$. □

Footnote (on the count of networks preceding Theorem 5): For clarity of exposition, networks with parallel edges are not counted in the total number of networks. When parallel edges are taken into account, the length of the field size $q$ should be $O(|\mathcal{V}|^2 \log(|\mathcal{E}|))$ to make the failure probability of tomography negligible.

Footnote (on the tightness claim preceding Theorem 6): We remark that there is a mismatch between the sufficient connectivity requirement in Assumption 2 (that there be $2z + 1$ edges between $s$ and each node) and the necessary connectivity requirement of Theorem 6 (that there be $z + 1$ edges between $s$ and each node). Whether this gap can be closed is still open.
In fact, the proof of Lemma 4 only requires that $\mathcal{G}$ and $\mathcal{G}'$ differ at a node with high connectivity. If the set of possible topologies is known a priori, we can relax the connectivity requirement. The following corollary formalizes this observation.

Corollary 7: For a set of possible networks $\{\mathcal{G}_1, \mathcal{G}_2, \ldots, \mathcal{G}_d\}$, if any two of them differ at a node that has max-flow at least $2z + 1$ from the source and max-flow at least $2z + 1$ to the receiver, then with probability at least $1 - d^2|\mathcal{V}|^4/q$ the receiver can find the correct topology from the received transform matrix.



C. Topology estimation for networks with random failures 

Under RLNC, we provide a polynomial-time scheme to recover the topology of a network that suffers random network errors (the definition of random errors can be found in Section II-E). The receiver $r$ proceeds in two stages. In the first stage (Algorithm II: FIND-IRV), $r$ recovers the IRV information during several rounds of network communication suffering random errors. In the second stage (Algorithm III: FIND-TOPO), $r$ uses the IRV information obtained to recover the topology. An interesting feature of the proposed algorithms is that random network failures actually make it easier to estimate the topology efficiently.
Assumptions, justifications, and notation:

1) Multiple "successful" source generations. A "successful" generation is one in which the number of errors does not exceed the bound $C - 1$ and the receiver $r$ can decode the source message correctly using network error-correcting codes (see Section II-G for details). The protocol runs for $t$ independent "successful" source generations, where $t$ is a design parameter chosen to trade off the probability of success against the computational complexity of the topology estimation protocol. Let $X(i)$ be the source message transmitted, $\mathcal{Z}(i)$ be the set of edges suffering errors, and $Y(i)$ be the received matrix in the $i$th source generation.

2) Weak connectivity requirement. It is assumed that each internal node has out-degree at least 2. Note that this is a necessary condition for each edge to be distinguishable from every other edge, i.e., for any pair of edges to be flow-independent (see the definition in Section II-C for details).

3) Each node knows the IDs of its neighbors, as in Assumption 3 of Section IV-B.

4) The network is not "noodle-like" (i.e., high-depth and narrow-width). To be precise, for any distinct $i, j \in [1,t]$ let the random variable $V(i,j)$ be 1 if and only if $\mathcal{Z}(i)$ is flow-independent of $\mathcal{Z}(j)$, i.e., flow-rank$(\mathcal{Z}(i) \cup \mathcal{Z}(j))$ = flow-rank$(\mathcal{Z}(i))$ + flow-rank$(\mathcal{Z}(j))$. Since the random network errors are independent of the source generation, $\Pr(V(i,j) \neq 1)$ has no dependence on $i$ and $j$, and is denoted $p_c$. Requiring that the network not be "noodle-like" means requiring that $p_c$ be bounded away from 1. (At a high level, the problem with such networks is that they have high description complexity, dominated by the height, but can only support a low information rate, dominated by the width.)

5) For each source generation, each edge $e$ independently suffers random errors with probability at least $p$. Note that Assumptions 1) and 4) require that the typical number of erroneous edges $p|\mathcal{E}|$ in each source generation be no more than $C$. Thus we can assume $p = \Omega(1/|\mathcal{E}|)$.

6) Weak type common randomness is assumed. It is justified by Theorem 3.
Stage I: Find candidate IRVs

Recall that the source message is formed as $X(i) = [I_C, M(i)]$, where $I_C$ is a $C \times C$ identity matrix and $M(i) \in \mathbb{F}_q^{C \times (n - C)}$ is the message. For any matrix $N$ with $n$ columns, let $N_h$ (respectively $N_m$) be the matrix comprising the first $C$ columns (respectively the last $n - C$ columns) of $N$. Then the algorithm that finds a set of candidate IRVs is as follows:

• Algorithm II, FIND-IRV: The algorithm recovers a set of candidate IRVs of the network from $t$ "successful" source generations.

• The input is $\{Y(i) : i \in [1,t]\}$. The output is $\mathcal{I}_{IRV}$, which is a set of dimension-one subspaces of $\mathbb{F}_q^C$ and is initialized as the empty set.

• Step A: For each $i \in [1,t]$, $r$ computes $M(i)$ using network error-correcting codes (see Section II-G for details) and then $E(i)_r = Y(i)_m - Y(i)_h M(i)$.

• Step B: The intersection of the column spaces $E(i)_r \cap E(j)_r$ is computed for each pair $i, j \in \{1, \ldots, t\}$. If $\mathrm{rank}(E(i)_r \cap E(j)_r) = 1$ for a pair, $E(i)_r \cap E(j)_r$ is added to $\mathcal{I}_{IRV}$.

• Step C: End FIND-IRV.
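Step B can be sketched as follows, assuming a prime field and assuming that Step A has already produced, for every generation, a list of vectors spanning the column space of the residual error matrix. The intersection of two spans is computed here with the Zassenhaus block-matrix method, which is one standard choice and not necessarily the procedure the authors had in mind. The names `rref_mod_p`, `intersect_spans`, and `find_irv` are hypothetical.

```python
from itertools import combinations


def rref_mod_p(rows, p):
    """Reduced row echelon form over GF(p); returns (nonzero rows, pivot columns)."""
    m = [[x % p for x in row] for row in rows]
    pivots = []
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(m)) if m[i][c]), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        inv = pow(m[r][c], -1, p)  # modular inverse (Python 3.8+)
        m[r] = [(x * inv) % p for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c]:
                f = m[i][c]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return m[:r], pivots


def intersect_spans(u, w, p):
    """Basis of span(u) ∩ span(w), read off the reduced block matrix [[u, u], [w, 0]]."""
    n = len(u[0])
    block = [list(x) + list(x) for x in u] + [list(y) + [0] * n for y in w]
    reduced, pivots = rref_mod_p(block, p)
    # Rows whose pivot lies in the right half have a zero left half;
    # their right halves form a basis of the intersection.
    return [tuple(row[n:]) for row, c in zip(reduced, pivots) if c >= n]


def find_irv(error_spans, p):
    """Collect every one-dimensional pairwise intersection as a candidate IRV line."""
    candidates = set()
    for a, b in combinations(error_spans, 2):
        common = intersect_spans(a, b, p)
        if len(common) == 1:
            candidates.add(common[0])  # normalized generator of the line
    return candidates


P = 7
spans = [
    [(1, 0, 0, 0), (0, 1, 0, 0)],  # generation 1: two erroneous edges
    [(0, 1, 0, 0), (0, 0, 1, 0)],  # generation 2: shares one edge with generation 1
    [(1, 0, 0, 0), (0, 0, 0, 1)],  # generation 3: shares one edge with generation 1
]
candidate_lines = find_irv(spans, P)
```

Each candidate line is stored through its normalized generator (leading entry 1), so the same one-dimensional subspace found from different pairs is added only once.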

Let $p_a$ denote $p_c + 2p_s + |\mathcal{E}|/q$, let $p_s$ be $1 - (1 - z/q)[1 - 2C^2/(n - C)]$, and let $\langle v \rangle$ be the dimension-one subspace spanned by a vector $v$. The following theorem characterizes the performance of FIND-IRV.

Theorem 8: The probability that $\mathcal{I}_{IRV}$ contains $\{\langle t'(e) \rangle : e \in \mathcal{E}\}$ is at least $1 - |\mathcal{E}|p_a^{tp/4}$.

The proof will be presented later.
Remark 1: An element of $\mathcal{I}_{IRV}$ carries no correspondence with any edge in the network. Such correspondences are found in the next stage by algorithm FIND-TOPO.

Remark 2: The probability $p_s$ asymptotically approaches 0 with increasing block-length $n$ and field size $q$. Hence $p_a$ is bounded away from 1 by Assumption 4. Thus if $t = \Omega(\log(|\mathcal{E}|)/p)$, the probability that $\mathcal{I}_{IRV}$ contains $\{\langle t'(e) \rangle : e \in \mathcal{E}\}$ is $1 - o(1)$. Since $p = \Omega(1/|\mathcal{E}|)$, without loss of generality we henceforth assume $t = \Theta(|\mathcal{E}|\log|\mathcal{E}|)$.

Remark 3: Since $2p_s + |\mathcal{E}|/q$ is asymptotically negligible for large block-length $n$ and field size $q$, $p_a$ approximately equals $p_c$. Also, Lemma 1 and Lemma 9 imply that for large $n$ and $q$, any two failing edge sets $\mathcal{Z}(i)$ and $\mathcal{Z}(j)$ across multiple source generations are flow-independent if and only if the corresponding error matrices $E(i)_r$ and $E(j)_r$ have linearly independent columns. Thus $r$ can estimate $1 - p_a$, and hence $1 - p_c$, by estimating the probability that pairs $E(i)_r$ and $E(j)_r$ are linearly independent. This enables $r$ to decide how many communication rounds $t$ are needed so that FIND-IRV has the desired probability of success.

Remark 4: The set of vectors output by FIND-IRV can also include some "fake candidates", as demonstrated by the example in Figure 9. In the next stage, topology estimation, these fake IRVs are filtered out automatically.

Fig. 9. Let $\mathcal{Z}(1) = \{e_1, e_4\}$, $\mathcal{Z}(2) = \{e_2, e_3\}$, and $\mathrm{rank}(t'(e_2), t'(e_3), t'(e_4)) = 3$. Then $\mathrm{rank}(T'(e_2,e_3) \cap T'(e_1,e_4)) = 1$ and $T'(e_2,e_3) \cap T'(e_1,e_4) = \langle t'(e_2) + 2t'(e_3) \rangle$, which is a "fake candidate".

Before proving Theorem 8, we show the following lemma, which arises from the properties of random errors and is a core lemma for network tomography under random network errors.

Lemma 9: For the random error model, the column space of $E(i)_r$ equals the column space of $T'(\mathcal{Z}(i))$ with probability at least $1 - p_s$.
Proof: Recall that $Z(i)_m$ comprises the last $n - C$ columns of $Z(i)$. We first prove that $Z(i)_m$ has full row rank $z$ with probability at least $1 - p_s$.

In the random error model (see Section II-E for details), each erroneous edge $e$ has at least one randomly chosen location (say $\ell$) in the injected packet $z(e)$ such that the $\ell$th component of $z(e)$ is chosen uniformly at random from $\mathbb{F}_q$. Thus for each row of $Z(i)$, all of the last $n - C$ elements are zero with probability at most $C/n$. By the Union Bound [31], $Z(i)_m$ has a zero row with probability at most $C^2/n$. Thus in the following we assume each row of $Z(i)_m$ is nonzero.

The "Birthday Paradox" [31] implies that with probability at least $1 - C^2/(n - C)$ the following happens: there are $z$ distinct column indices $l_1, \ldots, l_z \in \{1, \ldots, n - C\}$ such that, for each row $k$ of $Z(i)_m$, the entry $Z(i)_m(k, l_k)$ is chosen uniformly at random. Then the determinant of the sub-matrix formed by the $\{l_1, \ldots, l_z\}$th columns of $Z(i)_m$ is a nonzero polynomial of degree $z$ in uniformly random variables over $\mathbb{F}_q$. By the Schwartz-Zippel Lemma [30], this determinant is nonzero with probability at least $1 - z/q$. Thus $Z(i)_m$ has $z$ independent columns with probability at least $(1 - z/q)[1 - 2C^2/(n - C)] = 1 - p_s$.

Since $E(i)_r = E(i)_m - E(i)_h M(i) = T'(\mathcal{Z}(i))(Z(i)_m - Z(i)_h M(i))$ and the nonzero random variables in $Z(i)_m$ are chosen independently of $Z(i)_h M(i)$, the matrix $Z(i)_m - Z(i)_h M(i)$ has full row rank $z$ with the same probability $1 - p_s$. Thus the column space of $E(i)_r$ equals the column space of $T'(\mathcal{Z}(i))$ with probability at least $1 - p_s$. □
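To see how quickly these quantities become small, the following sketch evaluates $p_s = 1 - (1 - z/q)[1 - 2C^2/(n - C)]$ and $p_a = p_c + 2p_s + |\mathcal{E}|/q$ numerically. The parameter values are arbitrary illustrations chosen for this sketch, not values taken from the paper, and the function names are hypothetical.

```python
def p_s(z, q, c, n):
    """Failure probability of Lemma 9: 1 - (1 - z/q) * (1 - 2*C^2/(n - C))."""
    return 1 - (1 - z / q) * (1 - 2 * c ** 2 / (n - c))


def p_a(p_c, z, q, c, n, num_edges):
    """Per-pair failure probability p_a = p_c + 2*p_s + |E|/q."""
    return p_c + 2 * p_s(z, q, c, n) + num_edges / q


# Illustrative parameters: z = 2 errors, q = 2^16, min-cut C = 4, packet length n = 1028.
ps_example = p_s(z=2, q=2 ** 16, c=4, n=1028)
pa_example = p_a(p_c=0.5, z=2, q=2 ** 16, c=4, n=1028, num_edges=64)
```

With these values $p_s$ is roughly 0.031, dominated by the term $2C^2/(n - C)$, so $p_a$ stays close to $p_c$; increasing the block-length $n$ shrinks $p_s$ further.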

Then we have:

Proof of Theorem 8: For any edge $e$ and any $i, j \in \{1, \ldots, t\}$ with $e \in \mathcal{Z}(i) \cap \mathcal{Z}(j)$, we compute the probability of the event $F(e,i,j)$ that $\langle t'(e) \rangle$ equals the one-dimensional subspace $E(i)_r \cap E(j)_r$.

By Assumption 4, with probability at least $1 - p_c$, $\mathcal{Z}(i) \setminus e$ is flow-independent of $\mathcal{Z}(j) \setminus e$. Conditioned on this, Lemma 1.2 implies that with probability at least $1 - p_c - |\mathcal{E}|/q$, $T'(\mathcal{Z}(i) \setminus e)$ is linearly independent of $T'(\mathcal{Z}(j) \setminus e)$. Hence $T'(\mathcal{Z}(i)) \cap T'(\mathcal{Z}(j))$ equals $\langle t'(e) \rangle$. By Lemma 9, either $E(i)_r \neq T'(\mathcal{Z}(i))$ or $E(j)_r \neq T'(\mathcal{Z}(j))$ holds with probability at most $2p_s$. Combining these events, the probability of event $F(e,i,j)$ is at least $1 - p_c - 2p_s - |\mathcal{E}|/q = 1 - p_a$.

When $t$ is large enough, by the Chernoff bound [31], $e$ fails at least $tp/2$ times with high probability. Conditioned on this many failures, there are $tp/4$ probabilistically independent events $F(e,i,j)$ for edge $e$, and FIND-IRV accepts $\langle t'(e) \rangle$ with probability at least $1 - p_a^{tp/4}$. Taking the Union Bound over all edges gives the required result. □
Stage II: Topology recovery via candidate IRVs

Using $\mathcal{I}_{IRV}$, we now describe Algorithm FIND-TOPO, which determines the network topology.

Note that $\mathcal{I}_{IRV}$ is merely a set of dimension-one subspaces, and as such, an individual element may have no correspondence with the actual IRV of any edge in the network. At any point in FIND-TOPO, let $\bar{\mathcal{G}}$ denote the network topology recovered thus far. Let $\bar{\mathcal{V}}$ and $\bar{\mathcal{E}}$ be the corresponding sets of nodes and edges in $\bar{\mathcal{G}}$, and let $\bar{\mathcal{I}}_{IRV}$ be the set of IRVs of the edges in $\bar{\mathcal{E}}$, which are computed from $\bar{\mathcal{G}}$ and the set of local random code-books $\mathcal{R} = \{\mathcal{R}_v : v \in \mathcal{V}\}$. We note that the IRVs in $\bar{\mathcal{I}}_{IRV}$ are vectors rather than dimension-one subspaces.

We describe the algorithm that estimates the network topology as follows.

• Algorithm III, FIND-TOPO: The algorithm uses $\mathcal{I}_{IRV}$ and $\mathcal{R} = \{\mathcal{R}_v : v \in \mathcal{V}\}$ to recover the network topology.

• The input is $\mathcal{I}_{IRV}$ and $\mathcal{R}$. The output is $\bar{\mathcal{G}} = (\bar{\mathcal{V}}, \bar{\mathcal{E}})$.

• Step A: The set $\bar{\mathcal{V}}$ is initialized as the receiver $r$, all its upstream neighbors, and the source $s$. The set $\bar{\mathcal{E}}$ is initialized as the set of edges incoming to $r$. Hence $\bar{\mathcal{G}} = (\bar{\mathcal{V}}, \bar{\mathcal{E}})$. The initial set $\bar{\mathcal{I}}_{IRV}$ consists of the IRVs of the incoming edges of $r$, i.e., a set of distinct columns of the $C \times C$ identity matrix. The state flag STATE(New-Edge) is initialized to "False".

• Step B: For each node $v \neq s$ in $\bar{\mathcal{V}}$, call function FindNewEdge($v$) (Step C). If

- STATE(New-Edge) is "True", set STATE(New-Edge) to "False" and repeat the loop of Step B.

- STATE(New-Edge) is "False", go to Step E.

• Step C: (Function FindNewEdge($v$)) Let $e_1, \ldots, e_d$ be the outgoing edges of $v$ in $\bar{\mathcal{G}}$. If $\{t'(e_1), \ldots, t'(e_d)\}$ from $\bar{\mathcal{I}}_{IRV}$ has

- rank 1, step back and continue the loop in Step B.

- rank greater than 1, then for each candidate incoming edge of $v$, say $e = (u,v)$, if $e \notin \bar{\mathcal{E}}$, call function CheckIRV($v, e$) (Step D). Step back and continue the loop in Step B.

• Step D: (Function CheckIRV($v, e$)) Use $\mathcal{R}$ to compute the IRV of $e$ as $t'(e) = \sum_{j=1}^{d} \beta(e, v, e_j)t'(e_j)$. Check whether $\langle t'(e) \rangle$ is in $\mathcal{I}_{IRV}$. If so,

1) Set STATE(New-Edge) to "True".

2) If $u \notin \bar{\mathcal{V}}$, add $u$ to $\bar{\mathcal{V}}$.

3) Add $e = e(u,v)$ to $\bar{\mathcal{E}}$.

4) Based on $\mathcal{R}$, update $\bar{\mathcal{I}}_{IRV}$ from $\bar{\mathcal{G}} = (\bar{\mathcal{V}}, \bar{\mathcal{E}})$.
Step back to the loop in Step C.

• Step E: End FIND-TOPO.
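The membership test at the heart of Step D can be sketched as follows, assuming a prime field. A one-dimensional subspace is represented by its normalized generator (first nonzero entry scaled to 1), so that testing whether $\langle t'(e) \rangle$ is a candidate becomes a set look-up. The coefficients `betas` stand for the values $\beta(e, v, e_j)$ that the receiver would read from the code-book of $v$; the names `normalize` and `check_irv` are hypothetical.

```python
def normalize(vec, p):
    """Canonical generator of the line <vec> over GF(p): first nonzero entry scaled to 1."""
    reduced = [x % p for x in vec]
    lead = next((x for x in reduced if x), None)
    if lead is None:
        return None  # the zero vector spans no line
    inv = pow(lead, -1, p)  # modular inverse (Python 3.8+)
    return tuple((x * inv) % p for x in reduced)


def check_irv(betas, out_irvs, candidate_lines, p):
    """Compute t'(e) = sum_j beta_j * t'(e_j) and test whether <t'(e)> is a candidate line."""
    length = len(out_irvs[0])
    t_e = [sum(b * irv[k] for b, irv in zip(betas, out_irvs)) % p for k in range(length)]
    return normalize(t_e, p) in candidate_lines, t_e


P = 7
out_irvs = [(1, 0, 0), (0, 1, 0)]             # IRVs of the outgoing edges e_1, e_2 of v
candidate_lines = {normalize((4, 6, 0), P)}   # one candidate line from Stage I
accepted, t_e = check_irv([2, 3], out_irvs, candidate_lines, P)
rejected, _ = check_irv([1, 1], out_irvs, candidate_lines, P)
```

Here `(4, 6, 0)` is a scalar multiple of `(2, 3, 0)` over GF(7), so the first call accepts the edge even though the stored vector differs from the computed one, which is why the comparison is done on lines rather than on vectors.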

If $\mathcal{I}_{IRV}$ contains all edge IRVs, which is supported by Theorem 8, the correctness of FIND-TOPO is given by:
Theorem 10: With probability $1 - O(\log^2(|\mathcal{E}|)|\mathcal{E}|^4|\mathcal{V}|)/q$, FIND-TOPO recovers the accurate topology by performing $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^4|\mathcal{V}|C)$ operations over $\mathbb{F}_q$.

Before proving Theorem 10 we need the following lemma, which shows that with high probability function CheckIRV($v, e$) accepts an edge $e$ if and only if $e$ is actually in the network $\mathcal{G}$.
Lemma 11: 1) If edge $e = (u,v)$ exists in $\mathcal{G}$, $\langle t'(e) \rangle$ is in $\mathcal{I}_{IRV}$, $\{e_1, \ldots, e_d\}$ are exactly all the outgoing edges of $v$ in $\mathcal{G}$, and the IRVs computed for $e_1, \ldots, e_d$ in $\bar{\mathcal{G}}$ equal their true IRVs $t'(e_i)$ for $i = 1, 2, \ldots, d$, then function CheckIRV($v, e$) accepts $e$ as a new edge in $\bar{\mathcal{E}}$ with probability 1.

2) If edge $e$ does not exist in $\mathcal{G}$, function CheckIRV($v, e$) accepts $e$ as a new edge in $\bar{\mathcal{E}}$ with probability $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^2/q)$.
Proof:

1) Under these conditions, the computed vector $\sum_{j=1}^{d} \beta(e, v, e_j)t'(e_j)$ equals the true IRV $t'(e)$, and $e$ is accepted.

2) If $e$ does not exist in $\mathcal{G}$, the coding coefficients $\{\beta(e,v,e_j) : j = 1, \ldots, d\}$ are not used. Hence from the perspective of any element $\langle h \rangle$ in $\mathcal{I}_{IRV}$, $\sum_{j=1}^{d} \beta(e, v, e_j)t'(e_j)$ is an independently and uniformly chosen vector in the span of the vectors $\{t'(e_j) : j \in \{1, \ldots, d\}\}$. Since CheckIRV($v, e$) is called only if the rank of $\{t'(e_j) : j \in \{1, \ldots, d\}\}$ is at least 2, $t'(e) \in \langle h \rangle$ with probability at most $1/q$. Since FIND-IRV in Stage I needs at most $t = O(\log(|\mathcal{E}|)|\mathcal{E}|)$ source generations (as pointed out in Remark 2 after Theorem 8), $\mathcal{I}_{IRV}$ has size at most $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^2)$. By the Union Bound [31], $\langle t'(e) \rangle$ is in $\mathcal{I}_{IRV}$ with probability $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^2/q)$. □

Footnote (to item 4 of Step D in FIND-TOPO): The reason that $\bar{\mathcal{I}}_{IRV}$ needs to be updated is that when $e$ is found as a new edge in $\bar{\mathcal{G}}$, the IRVs of the edges upstream of $e$ in $\bar{\mathcal{G}}$ will change.

Then we have:

Proof of Theorem 10: Note that if no errors occur, Step B can find at most $|\mathcal{E}|$ edges, each discovery of a new edge in Step B needs at most $|\mathcal{V}|$ invocations of Step C (once for each node), and each invocation of Step C results in at most $|\mathcal{E}|$ invocations of Step D. Thus Step D can be invoked at most $|\mathcal{E}|^2|\mathcal{V}|$ times, and Lemma 11 demonstrates that each invocation results in an error with probability at most $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^2/q)$. Note further that this is the only possible error event. Hence by the Union Bound [31], the probability that FIND-TOPO results in an erroneous reconstruction of $\mathcal{G}$ is $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^4|\mathcal{V}|)/q$. Also, each execution of Step D takes at most $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^2C)$ finite-field comparisons to determine membership of $\langle t'(e) \rangle$ in $\mathcal{I}_{IRV}$. Hence, given the bound on the number of invocations of Step D, and since this can be verified to be the most computationally expensive step, the running time of FIND-TOPO is $O(\log^2(|\mathcal{E}|)|\mathcal{E}|^4|\mathcal{V}|C)$ operations over $\mathbb{F}_q$.

Finally, recall that $\mathcal{G}$ is acyclic and that $\mathcal{I}_{IRV}$ is assumed to contain $\{\langle t'(e) \rangle : e \in \mathcal{E}\}$. Hence, conditioned on no incorrect edges being accepted, for each invocation of Step B, unless $\bar{\mathcal{G}} = \mathcal{G}$, there exists an edge $e$ not yet in $\bar{\mathcal{E}}$ such that all edges $e'$ downstream of $e$ in $\mathcal{G}$ are in $\bar{\mathcal{E}}$, which implies that all the corresponding $t'(e')$ are correctly computed. Thus by Lemma 11, edge $e$ is accepted into $\bar{\mathcal{E}}$ by function CheckIRV($v, e$) with probability 1. Hence each edge actually in $\mathcal{G}$ eventually ends up in $\bar{\mathcal{G}}$, and FIND-TOPO terminates. □

V. Error localization for RLNC 

As in previous works ([8], [13], [14]), under RLNC the receiver $r$ must know the network topology and the local random coding coefficients to locate network errors. Thus in this section receiver $r$ is assumed to know the IRV of each edge, which can be obtained either from the topology estimation algorithms in Section IV or a priori from the network design.

A. Locating adversarial errors under RLNC

In this subsection we demonstrate how to detect the edges in the network where the adversary injects errors. Since the IRV is the fingerprint of the corresponding edge, detecting the erroneous edges becomes the equivalent mathematical problem of detecting the IRVs in the error matrix $E$. Our technique is based on the fact that when the edges are sufficiently flow-independent of each other (see the definition in Section II-C for details), the IRV of each erroneous edge cannot be erased from the column space of the error matrix $E$.
Assumptions and justifications:

1) Each internal node has out-degree at least $2z$. Since $\mathcal{G}$ is acyclic, this implies that every set of $2z$ edges in $\mathcal{G}$ is flow-independent. While this assumption seems strong, we demonstrate in Theorem 13 that such a condition is necessary for $r$ to identify the locations of $z$ corrupted edges.
2) At most $z$ edges in $\mathcal{Z}$ suffer errors, i.e., $\{e : e \in \mathcal{E}, z(e) \neq 0\} = \mathcal{Z}$ and $|\mathcal{Z}| \le z$. When $2z + 1 \le C$, network error-correcting codes (see Section II-G for details) are used so that the source message $X$ (and thus the error matrix $E$) is provably decodable.

Then we have:

• ALGORITHM IV, LOCATE-ADVERSARY-RLNC: The algorithm locates the network adversaries under RLNC.

• The input is the error matrix $E$ and $\{t'(e) : e \in \mathcal{E}\}$. The output is a set of edges $\mathcal{Z}'$.

• Step A: Compute $\mathrm{rank}(E) = \eta$. Let $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_\eta\}$ be a set of independent columns of $E$.

• Step B: For $i = 1, 2, \ldots, \eta$, find a set of edges $\mathcal{Z}_i$ with minimal cardinality such that $\mathbf{e}_i$ is in the column space of the corresponding impulse response matrix $T'(\mathcal{Z}_i)$.

• Step C: Output $\mathcal{Z}' = \bigcup_{i \in [1,\eta]} \mathcal{Z}_i$.

• Step D: End LOCATE-ADVERSARY-RLNC.
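Steps B and C can be sketched with an exhaustive search over edge subsets of increasing size. This is a minimal illustration, assuming a prime field and assuming that the independent columns of $E$ from Step A are supplied by the caller; the search is exponential in $z$, and the names `rank_mod_p`, `in_span`, and `locate_adversary` are hypothetical.

```python
from itertools import combinations


def rank_mod_p(rows, p):
    """Rank of a matrix (list of rows) over the prime field GF(p), by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def in_span(vec, vectors, p):
    """True if vec lies in the span of vectors: appending it does not raise the rank."""
    return rank_mod_p(vectors + [list(vec)], p) == rank_mod_p(vectors, p)


def locate_adversary(error_columns, irvs, z, p):
    """For each error column, find a smallest edge set whose IRVs span it; return the union."""
    located = set()
    for col in error_columns:
        for size in range(1, z + 1):
            hit = next(
                (s for s in combinations(sorted(irvs), size)
                 if in_span(col, [list(irvs[e]) for e in s], p)),
                None,
            )
            if hit is not None:
                located.update(hit)
                break  # smallest size found for this column
    return located


P = 7
# Five edges whose IRVs are such that any four are linearly independent (z = 2).
irvs = {"a": (1, 0, 0, 0), "b": (0, 1, 0, 0), "c": (0, 0, 1, 0),
        "d": (0, 0, 0, 1), "e": (1, 1, 1, 1)}
# The adversary corrupts edges a and c; two independent columns of the error matrix:
error_columns = [(1, 0, 2, 0), (3, 0, 0, 0)]
found = locate_adversary(error_columns, irvs, z=2, p=P)
```

In this example the first column needs both corrupted edges to be explained, while the second is explained by edge `a` alone, and the union recovers the corrupted set.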

We show that with high probability LOCATE-ADVERSARY-RLNC finds the locations of the edges with adversarial errors.

Theorem 12: With probability at least $1 - |\mathcal{E}|\binom{|\mathcal{E}|}{2z}/q$, the solution of LOCATE-ADVERSARY-RLNC results in $\mathcal{Z}' = \mathcal{Z}$.

Proof: Note that Assumption 1), with high probability, gives a similar statement about the rank of the corresponding IRVs. Applying the Union Bound [31] to the result of Lemma 1.2 shows that any $2z$ IRVs are linearly independent with probability at least $1 - |\mathcal{E}|\binom{|\mathcal{E}|}{2z}/q$. We henceforth assume that this is the case.
First, since each $\mathbf{e}_i$ is in the column space of $T'(\mathcal{Z})$, we have $|\mathcal{Z}_i| \le z$ for each $i = 1, 2, \ldots, \eta$.

We claim that for each $i \in \{1, 2, \ldots, \eta\}$, $\mathcal{Z}_i$ must be a subset of $\mathcal{Z}$. If not, say $e \in \mathcal{Z}_i$ is not in $\mathcal{Z}$. By the definition of LOCATE-ADVERSARY-RLNC, $t'(e)$ is in the span of the columns of $T'(\mathcal{Z})$ and $T'(\mathcal{Z}_i \setminus e)$. Thus a nontrivial combination of at most $2z - 1$ IRVs results in $t'(e)$, which contradicts the fact that any $2z$ IRVs are linearly independent.

We next prove that for any edge $e \in \mathcal{Z}$ on which the adversary injects a nonzero error, LOCATE-ADVERSARY-RLNC outputs at least one $\mathcal{Z}_i$ such that $e \in \mathcal{Z}_i$. Without loss of generality, let $e$ be the first edge in $\mathcal{Z}$. Then $E = T'(\mathcal{Z})Z$ and the first row of $Z$ is nonzero. Since any $z$ IRVs are independent, $T'(\mathcal{Z})$ has full column rank. Then among any $\eta$ independent columns of $E$ there must be at least one, say $\mathbf{e}_i$, to which the IRV $t'(e)$ has a nonzero contribution. That is, $\mathbf{e}_i = T'(\mathcal{Z})(c_1, c_2, \ldots, c_z)^T$ with $c_1 \neq 0$. Hence running LOCATE-ADVERSARY-RLNC on $\mathbf{e}_i$ will find $t'(e)$ and include the corresponding edge $e$ in $\mathcal{Z}_i$. Otherwise, $t'(e)$ is in the span of $T'(\mathcal{Z} \setminus e)$ and $T'(\mathcal{Z}_i)$, which contradicts the fact that any $2z$ IRVs are linearly independent. □



We now show matching converses for Theorem 12. In particular, we demonstrate in Theorem 13 that Assumption 1), i.e., that any $2z$ edges are flow-independent, is necessary.

Theorem 13: For linear network coding, any $z$ corrupted edges are detectable if and only if any $2z$ edges are flow-independent.

Proof: The "if" direction is a corollary of Theorem 12. For the "only if" direction, suppose there exist $2z$ edges that are not flow-independent. Then the corresponding IRVs cannot be linearly independent by Lemma 1.1. Then there must exist a partition of these $2z$ edges into two edge sets $\mathcal{Z}_1$ and $\mathcal{Z}_2$ such that $|\mathcal{Z}_1| = z$, $|\mathcal{Z}_2| = z$ and $T'(\mathcal{Z}_1) \cap T'(\mathcal{Z}_2) \neq \{0\}$, i.e., the spans of the corresponding IRVs in the two sets intersect non-trivially. Then the adversary can choose to corrupt $\mathcal{Z}_1$ and inject errors $Z$ in a manner such that the columns of $T'(\mathcal{Z}_1)Z$ are in $T'(\mathcal{Z}_2)$. This means $r$ cannot distinguish whether the errors are from $\mathcal{Z}_1$ or $\mathcal{Z}_2$. □



Theorem 13 deals with the case that any z edges can be corrupted. If only some sets of edges are candidates for 
adversarial action (for instance the set of outgoing edges from some "vulnerable" nodes) we obtain the following 
corollary. 

Corollary 14: Let $S = \{\mathcal{Z}_1, \mathcal{Z}_2, \ldots, \mathcal{Z}_t\}$ be disjoint sets of edges such that exactly one of them is controlled by an adversary. Then $r$ can detect which edge set is controlled by the adversary if and only if any two sets $\mathcal{Z}_i$ and $\mathcal{Z}_j$ in $S$ are flow-independent.

Note: The flow-independence between edge sets $\mathcal{Z}_i$ and $\mathcal{Z}_j$ in $S$ does not require the edges within either $\mathcal{Z}_i$ or $\mathcal{Z}_j$ to be flow-independent. It merely requires that flow-rank($\mathcal{Z}_i$) + flow-rank($\mathcal{Z}_j$) = flow-rank($\mathcal{Z}_i \cup \mathcal{Z}_j$).

Note that running LOCATE-ADVERSARY-RLNC might require checking all the $\binom{|\mathcal{E}|}{z}$ subsets of edges in the network, which is exponential in $z$. We now demonstrate that for networks performing RLNC, the task of locating the set of adversarial edges is in fact computationally intractable even when the receiver knows the topology and local encoding coefficients in advance.

Theorem 15: For RLNC, if knowing the network $\mathcal{G}$ and all local coding coefficients allows the receiver $r$ to correctly locate all adversarial locations in time polynomial in the network parameters, then NCPRLC (see the definition in Section II-H for details) can be solved in time polynomial in the problem parameters.

Proof: Given an NCPRLC instance $(H, z, \mathbf{e})$, as shown in Figure [Sj we construct a network with $l_1$ edges to receiver $r$ and $l_2$ edges to node $u$.

Since $H$ is a matrix chosen uniformly at random over $\mathbb{F}_q$, it corresponds to an RLNC, where each column of $H$ corresponds to the IRV of an incoming edge of $u$.

Assume the adversary corrupts $z$ incoming edges of $u$. The adversary can choose the errors $Z$ such that each column of $E = T'(\mathcal{Z})Z$ equals $\mathbf{e}$. Meanwhile, $E$ is all the information about the adversarial behavior known by $r$ under RLNC. Any algorithm that outputs the corrupted set $\mathcal{Z}$ must satisfy $\mathbf{e} \in T'(\mathcal{Z})$ and $|\mathcal{Z}| \le z$. Once $\mathcal{Z}$ is found, $r$ has in effect solved the NCPRLC instance $(H, z, \mathbf{e})$. □

B. Locating random errors under RLNC 



We now consider the problem of finding the set of edges $\mathcal{Z}$ that experience random errors (see Section II-E for details). Since $T'(\mathcal{Z}) = T'(Ext(\mathcal{Z}))$ (see the definition of $Ext(\mathcal{Z})$ in Section III-A for reference), the receiver cannot distinguish whether the errors are from $\mathcal{Z}$ or $Ext(\mathcal{Z})$. So rather than finding $\mathcal{Z}$, we provide a computationally tractable algorithm to locate $Ext(\mathcal{Z})$, a proxy for $\mathcal{Z}$. The algorithm that finds $Ext(\mathcal{Z})$ is as follows:

• ALGORITHM V, LOCATE-RANDOM-RLNC: Under RLNC, the algorithm locates the edges in the network suffering random errors.

• The input is the matrix $Y$ received by $r$ and $\{\mathbf{t}'(e) : e \in \mathcal{E}\}$. The output is an edge set $\mathcal{Z}'$ initialized as an empty set.

• Step A: Compute $E_r$ as in Step A of Algorithm II, FIND-IRV.

• Step B: For each edge $e$, check whether its IRV $\mathbf{t}'(e)$ lies in $E_r$. If so, add $e$ to $\mathcal{Z}'$.

• Step C: End LOCATE-RANDOM-RLNC.
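Step B amounts to one subspace-membership test per edge. The sketch below illustrates it under the assumptions that the field size `p` is prime and that a spanning set of $E_r$ has already been obtained from FIND-IRV (which is not reproduced here). The function names and the toy IRVs are hypothetical.

```python
def rank_mod_p(vectors, p):
    """Rank over GF(p) of a list of equal-length vectors (Gaussian elimination)."""
    m = [[x % p for x in v] for v in vectors]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def locate_random_rlnc(Er_span, irv, p):
    """Return every edge whose IRV lies in the subspace spanned by Er_span."""
    base_rank = rank_mod_p(Er_span, p)
    return {e for e, t in irv.items()
            if rank_mod_p(Er_span + [t], p) == base_rank}

# Toy example over GF(7): errors hit e1 and e2, so E_r = span{t'(e1), t'(e2)}.
irv = {"e1": [1, 0, 0], "e2": [0, 1, 0], "e3": [0, 0, 1],
       "e4": [1, 1, 1], "e5": [1, 2, 4]}
Er_span = [[1, 1, 0], [1, 2, 0]]  # a different basis of the same subspace
print(locate_random_rlnc(Er_span, irv, 7))
```

Any edge whose IRV happens to fall in the same subspace would also be reported, which is why the algorithm is stated in terms of $Ext(\mathcal{Z})$ rather than $\mathcal{Z}$.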

The correctness of LOCATE-RANDOM-RLNC is established by the following theorem.

Theorem 16: If $z$ is no more than $C - 1$, then $\mathcal{Z}' = Ext(\mathcal{Z})$ with probability at least $1 - 3|\mathcal{E}|^2/q - 2C^2/(n - C)$. The computational complexity is $O(|\mathcal{E}|C^2)$ operations over $\mathbb{F}_q$.

Proof: Lemma [Tpl and Lemma 9 imply that $E_r = T'(\mathcal{Z}) = T'(Ext(\mathcal{Z}))$ with probability at least $1 - 2|\mathcal{E}|/q - 2C^2/(n - C)$. This implies $Ext(\mathcal{Z}) \subseteq \mathcal{Z}'$.

For the other direction, applying the Union Bound [31] over all $|\mathcal{E}|$ edges to Lemma [Tp} shows that, with probability at least $1 - |\mathcal{E}|^2/q$, for any edge $e \notin Ext(\mathcal{Z})$, $\mathbf{t}'(e)$ is not in $E_r$. In the end we have $Ext(\mathcal{Z}) = \mathcal{Z}'$ with probability at least $1 - 3|\mathcal{E}|^2/q - 2C^2/(n - C)$.

For each IRV $\mathbf{t}'(e)$, it costs at most $O(C^2)$ operations over $\mathbb{F}_q$ to check whether it is in $E_r$. Then the total complexity of LOCATE-RANDOM-RLNC is $O(|\mathcal{E}|C^2)$ operations over $\mathbb{F}_q$. □

Part II: Design Network Coding for Network Tomography 
VI. Network Reed-Solomon Coding (NRSC) 

A. Motivations 

In Part I, network tomography under random linear network coding (RLNC) is studied for both adversarial and random error models (see Section II-E for the definition of the error models). For the random error model, both static topology estimation and error localization can be done in polynomial time, while the schemes for the adversarial error model all cost exponential time. Moreover, under RLNC localizing adversarial errors is computationally intractable (see Theorem 15) and requires knowledge of the network topology, whose estimation algorithm also costs exponential time.

In this section network Reed-Solomon coding (NRSC) is proposed to improve the tomographic performance (especially for the adversarial error model) while preserving the key advantages of RLNC. To be concrete, NRSC has the following features:

• Low implementation complexity. The proposed NRSC is a linear network coding scheme (see Section II-D for details), and can be implemented in a distributed and efficient manner where each network node only needs to know the node-IDs of its adjacent neighbors. Thus once an edge (or node) leaves or joins, only its adjacent neighbors need to adjust their coding coefficients.

• High throughput. The capacity of multicast is achieved with high probability.

• NRSC aids tomography in the following two aspects:
- Computational efficiency. For the adversarial error model, the receiver under NRSC can locate a number of adversarial errors that matches a corresponding tomographic upper bound (see Theorem 13 for details) in a computationally efficient manner. For the random error model, a lightweight topology estimation algorithm is provided under NRSC.
- Robustness for dynamic networks. For adversarial (and random) error localization, the algorithms under NRSC do not require a priori knowledge of the network topology and thus are robust against edge and node updates. For topology estimation in the random error model, the lightweight algorithm under NRSC fits dynamic networks better than the one under RLNC.

Fig. 10. The IRV of $e$ is a linear combination of the IRVs of $e_1$, $e_2$, $e_3$ and $e_4$: $\mathbf{t}'(e) = \sum_{i \in [1,4]} \beta_i \mathbf{t}'(e_i)$.

B. Overview of NRSC 

In NRSC, in addition to an IRV, each edge $e$ is also assigned a virtual IRV $\mathbf{t}''(e)$. This virtual IRV is a deterministic function of the node-IDs of the header and tail of $e$ (and hence is known to them). Further, each node in an NRSC network (say node $v$ in Figure 10) carefully chooses its coding coefficients (e.g., $\{\beta_1, \ldots, \beta_4\}$ at node $v$ in Figure 10, where $\beta_i = \beta(e, v, e_i)$ for $i = 1, \ldots, 4$) such that the virtual IRVs of edges entering and leaving $v$ satisfy the same linear relationship as the IRVs (in the case of Figure 10, $\mathbf{t}''(e) = \beta_1\mathbf{t}''(e_1) + \ldots + \beta_4\mathbf{t}''(e_4)$). In other words, under NRSC every network node makes a "local contribution" to force the edge IRVs to match the corresponding VIRVs. This objective can be achieved if and only if a connectivity requirement is satisfied (see Corollary 22 for details).



At a high level, we compare the tomography performance of RLNC and NRSC as follows:

• Computational efficiency. Under RLNC, each edge IRV is randomly chosen from the linear subspace spanned by the IRVs of its downstream edges, so that locating network adversaries is as hard as NCPRLC (see Theorem 15 for details). Under NRSC, VIRVs (and hence IRVs) are carefully chosen such that locating adversaries can be done efficiently.

• Robustness for dynamic networks. Under RLNC, an update of an edge $e$ changes the IRVs of all edges upstream of $e$. Under NRSC, once an edge is updated, the adjacent node adjusts its local coefficients so that the IRV change does not propagate. For instance, consider the subnetwork in Figure 10. Once edge $e_1$ is disconnected, $v$ changes its local coefficients such that the IRV of $e$ still equals the VIRV of $e$. Thus no update is needed for the nodes upstream of $v$.

Note that under RLNC, the error localization algorithms in previous works fSl, fT2l, flil and this paper require a priori knowledge of the network topology. However, topology estimation under RLNC costs exponential time for networks with adversarial errors, and polynomial time for static networks with random errors.

C. Node and edge IDs 

Each pair of nodes $(u, v)$ in $\mathcal{V} \otimes \mathcal{V}$ has an ID $id(u, v)$ chosen independently and uniformly at random from $\mathbb{F}_q$. These IDs can be broadcast by the source using digital signature schemes such as RSA [29], or output by a pseudorandom hash function (with a pair of nodes as input) such as AES [33] that can be accessed by all parties. Thus this set of $|\mathcal{V}|^2$ IDs is publicly known a priori to all parties (including the adversaries), even though they may not know which nodes and edges are actually in the network.

The following lemma shows that each node pair has a distinct ID with high probability:

Lemma 17: With probability at least $1 - |\mathcal{V}|^4/q$, for any $(u,v) \neq (u',v')$ in $\mathcal{E}$, $id(u,v) \neq id(u',v')$.
Proof: For any $(u,v) \neq (u',v')$, $id(u,v) = id(u',v')$ with probability at most $1/q$. Since $\mathcal{V} \times \mathcal{V}$ has size $|\mathcal{V}|^2$, there are at most $\binom{|\mathcal{V}|^2}{2} < |\mathcal{V}|^4$ distinct pairs of elements of $\mathcal{V} \times \mathcal{V}$. Using the Union Bound [31] over all these pairs, the lemma is true with probability at least $1 - |\mathcal{V}|^4/q$. □

For each edge $e(u,v) \in \mathcal{E}$ the ID of $e$ is $id(e) = id(u,v)$. Thus the ID of edge $e(u,v)$ can be determined by both $u$ and $v$ if they know their adjacent neighbors. A direct corollary of Lemma 17 is that each edge has a distinct ID with high probability. We henceforth assume that this is indeed the case.

Note that for scenarios where parallel edges are allowed, we assume that some pairs of nodes have multiple IDs, the $i$th of which is the ID of the $i$th edge between them.

For each edge $e$ the virtual impulse response vector (VIRV) is $\mathbf{t}''(e) \in \mathbb{F}_q^{C}$, which is $[id(e), (id(e))^2, \ldots, (id(e))^C]^T$. For any set of edges $\mathcal{Z}$ with size $z$, the virtual impulse response matrix (VIRM) is $T''(\mathcal{Z}) \in \mathbb{F}_q^{C \times z}$, with columns comprising $\{\mathbf{t}''(e), e \in \mathcal{Z}\}$.

For ease of notation we also define a dimension-parameterized VIRV as $\mathbf{t}''(e, i) = [id(e), (id(e))^2, \ldots, (id(e))^i]^T$. For any set of edges $\mathcal{Z}$ with size $z$, the corresponding VIRM is $T''(\mathcal{Z}, i) \in \mathbb{F}_q^{i \times z}$, with columns comprising $\{\mathbf{t}''(e, i), e \in \mathcal{Z}\}$. Note that $T''(\mathcal{Z}, z)$ is a Vandermonde matrix and is invertible when $|\mathcal{Z}| = z$ and the edges in $\mathcal{Z}$ have distinct IDs.

Note that the randomness of the IDs is used in proving Lemma 17 and Theorem 18, whose events (the distinctness of node-pair IDs and the throughput of multicast) are polynomial-time distinguishable. Thus pseudorandomness suffices [32].



D. Code construction of NRSC 

We assume by default that the edges in $\mathcal{E}$ have distinct IDs, which happens with probability at least $1 - |\mathcal{V}|^4/q$ by Lemma 17. Recall that $C$ is the capacity of the network, i.e., $C = \text{max-flow}(s, r)$, and for ease of notation we assume that the source has exactly $C$ outgoing edges and the receiver has $C$ incoming edges (see Section II-B for details).

The construction of NRSC is then as follows.

Source encoder: Let $Out(s) = \{e_1, e_2, \ldots, e_C\}$ be the outgoing edges of the source $s$ and $X \in \mathbb{F}_q^{C \times n}$ be the source message matrix. The source $s$ computes $M = T''(Out(s), C)^{-1}X$ and sends the $i$th row of $M$ as the packet over $e_i$. Note that $X$ contains a known "header", say the $C \times C$ identity matrix over $\mathbb{F}_q$, to indicate the network transform to the receiver.

Network encoders: Let $Out(v) = \{e_1, e_2, \ldots, e_d\}$ be the outgoing edges of node $v$. For an incoming edge $e$ of $v$, $v$ computes $\mathbf{b}(e) = T''(Out(v), d)^{-1}\mathbf{t}''(e, d)$. For the coding coefficient $\beta(e, v, e_i)$ from $e$ via $v$ to $e_i$, $v$ sets $\beta(e, v, e_i)$ to be the $i$th component of $\mathbf{b}(e)$.

Receiver decoder: The receiver receives

$Y = TX$, (8)

where $T \in \mathbb{F}_q^{C \times C}$ is indicated by the header of $Y$. If $T$ is invertible the receiver can decode $X$ correctly.

Thus, similar to RLNC fH, NRSC can be implemented in a distributed manner given that each node knows its local topology, i.e., its adjacent neighbors. If an edge/node is added/deleted, only local adjustments are needed.
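The network-encoder step can be illustrated with a small Python sketch that solves the Vandermonde-type system $T''(Out(v), d)\,\mathbf{b}(e) = \mathbf{t}''(e, d)$ over a prime field. The prime field size `p`, the function names, and the example IDs are assumptions made for illustration. The IDs are taken to be distinct and nonzero so that the system is invertible.

```python
def virv(edge_id, length, p):
    """Dimension-parameterized VIRV [id, id^2, ..., id^length] over GF(p)."""
    return [pow(edge_id, k, p) for k in range(1, length + 1)]

def solve_mod_p(A, y, p):
    """Solve A x = y over GF(p) for a square invertible A (Gauss-Jordan)."""
    n = len(A)
    M = [[A[i][j] % p for j in range(n)] + [y[i] % p] for i in range(n)]
    for c in range(n):
        pivot = next(i for i in range(c, n) if M[i][c])
        M[c], M[pivot] = M[pivot], M[c]
        inv = pow(M[c][c], -1, p)  # modular inverse (Python 3.8+)
        M[c] = [(x * inv) % p for x in M[c]]
        for i in range(n):
            if i != c and M[i][c]:
                f = M[i][c]
                M[i] = [(a - f * b) % p for a, b in zip(M[i], M[c])]
    return [M[i][n] for i in range(n)]

def local_coefficients(in_id, out_ids, p):
    """Coefficients b(e) for an incoming edge with ID in_id at a node whose
    outgoing edges have IDs out_ids (assumed distinct and nonzero mod p)."""
    d = len(out_ids)
    cols = [virv(x, d, p) for x in out_ids]                  # columns of T''(Out(v), d)
    A = [[cols[j][i] for j in range(d)] for i in range(d)]   # row-major form
    return solve_mod_p(A, virv(in_id, d, p), p)

# Toy node over GF(11): incoming edge ID 7, outgoing edge IDs 2, 3 and 5.
out_ids = [2, 3, 5]
b = local_coefficients(7, out_ids, 11)
# Check: the VIRVs satisfy t''(e, d) = sum_i b_i t''(e_i, d).
combo = [sum(bi * ti for bi, ti in zip(b, row)) % 11
         for row in zip(*[virv(x, 3, 11) for x in out_ids])]
print(combo == virv(7, 3, 11))
```

Because the coefficients depend only on the IDs of the edges incident to the node, the computation is local.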



E. Optimal throughput for multicast scenario 

The theorem below shows that with high probability NRSC achieves the multicast capacity.

Theorem 18: With probability at least $1 - C|\mathcal{E}|^2/q$, receiver $r$ can decode $X$ correctly.
Proof: Let $\mathcal{X}$ be the set of all random variables involved, i.e., $\mathcal{X} = \{id(u,v) : (u,v) \in \mathcal{V} \otimes \mathcal{V}\}$. By default we assume that any polynomial mentioned in the proof has variables in $\mathcal{X}$.

Let $\det_{\mathcal{G}} = \prod_{u \in \mathcal{V}} \det(u)$, where $\det(u)$ is the determinant of the matrix $T''(Out(u), |Out(u)|)$ for node $u \in \mathcal{V}$. For each $u \in \mathcal{V}$, since each component of $T''(Out(u), |Out(u)|)$ is a polynomial of degree at most $|Out(u)|$, $\det(u)$ is a polynomial of degree at most $|Out(u)|^2$. Thus $\det_{\mathcal{G}}$ is a polynomial of degree at most $\sum_{u \in \mathcal{V}} |Out(u)|^2 \le (\sum_{u \in \mathcal{V}} |Out(u)|)^2 = |\mathcal{E}|^2$.

Let $T$ be the transform matrix from $s$ to $r$ defined in Equation (8). We claim each element of $\det_{\mathcal{G}} T$ is a polynomial of degree at most $|\mathcal{E}|^2$. To see this, we first note that each component of $\det(u)\,T''(Out(u), |Out(u)|)^{-1}$ is a polynomial of degree at most $|Out(u)|^2 - |Out(u)|$ (see Cramer's rule in [34]). Thus in the construction of NRSC each local coding coefficient $\beta(e,u,e')$ used by $u \in \mathcal{V}$ is $Poly(e,u,e')/\det(u)$, where $Poly(e,u,e')$ is a polynomial of degree at most $|Out(u)|^2$. Each element in $T$ can be expressed as $\sum_{\alpha} \beta(\alpha)$, where $\beta(\alpha) = \prod_{(e,u,e') \in \alpha} \beta(e,u,e')$ and $\alpha$ is a path from $s$ to $r$ (see [5] for references). Thus each element in $T$ can be expressed as $\sum_{\alpha} Poly_\alpha / (\prod_{u \in \alpha} \det(u))$, where $Poly_\alpha = \prod_{(e,u,e') \in \alpha} Poly(e,u,e')$. Thus $Poly_\alpha$ is a polynomial of degree at most $\sum_{u \in \alpha} |Out(u)|^2 \le \sum_{u \in \mathcal{V}} |Out(u)|^2 \le |\mathcal{E}|^2$. Since no node appears twice in a path of an acyclic network, $\det_{\mathcal{G}}$ is divisible by $\prod_{u \in \alpha} \det(u)$ for each path $\alpha$. Thus $\det_{\mathcal{G}}\,Poly_\alpha / (\prod_{u \in \alpha} \det(u))$ is a polynomial of degree at most $|\mathcal{E}|^2$. This completes the proof of the claim that each element of $\det_{\mathcal{G}} T$ is a polynomial of degree at most $|\mathcal{E}|^2$.

Now we prove that $\det_{\mathcal{G}} T$ is invertible with high probability. The determinant of $\det_{\mathcal{G}} T$ is denoted $\det_T$, which is therefore a polynomial of degree at most $|\mathcal{E}|^2 C$.

Without loss of generality let $\{\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_C\}$ be the edge-disjoint paths from the source $s$ to the receiver $r$. We first prove that $\det_T$ is a nonzero polynomial, i.e., that there exists an evaluation of $\mathcal{X}$ such that $\det_{\mathcal{G}} \neq 0$ (i.e., for each $u \in \mathcal{V}$ no two edges in $Out(u)$ have the same ID) and the source can transmit $C$ linearly independent packets via $\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_C$.

The evaluation of $\mathcal{X}$ is as follows. First, assume each edge has a distinct ID. Second, since the $i$th outgoing edge of the source sends the $i$th row of $M = T''(Out(s), C)^{-1}X$, the paths $\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_C$ carry linearly independent packets on their initial edges. Third, the IDs of the edges in $\mathcal{P}_i$ are all changed to the ID of the first edge in $\mathcal{P}_i$. Note that this operation preserves the property that for each $u \in \mathcal{V}$ no two edges in $Out(u)$ have the same ID (i.e., $\det_{\mathcal{G}} \neq 0$). Finally, the network in effect uses routing to transmit the $C$ independent source packets via $\mathcal{P}_1, \mathcal{P}_2, \ldots, \mathcal{P}_C$.

Thus under the above evaluation of $\mathcal{X}$ the matrix $\det_{\mathcal{G}} T$ is invertible and therefore $\det_T \neq 0$. By the Schwartz-Zippel Lemma [30], $\det_T \neq 0$ and thus receiver $r$ can decode $X$ with probability at least $1 - |\mathcal{E}|^2 C/q$ over all the evaluations of $\mathcal{X}$. □

Thus if the network has $k$ receivers, using the Union Bound [31] over all receivers we conclude that with probability at least $1 - k|\mathcal{E}|^2 C/q$ each receiver can decode $X$.

Therefore the techniques for RLNC in the multicast scenario can be directly carried over to NRSC. For instance, using network error-correcting codes [10], [11], NRSC is able to attain the optimal throughput for multicast with network errors.

F. IRVs under NRSC



Following Theorem 18, the relations between IRVs and network structure can be shown to be the same as those for RLNC (see Lemma 1 for details). To be concrete, for networks performing NRSC we have:

Lemma 19: 1) The rank of the impulse response matrix $T'(\mathcal{Z})$ of an edge set $\mathcal{Z}$ with flow-rank $z$ is at most $z$.

2) The IRVs of flow-independent edges are linearly independent with probability at least $1 - C|\mathcal{E}|^2/q$.
Proof: The proof is similar to the proof of Lemma 1. □

Note that for the random error model (see Section II-E for details), all tomography schemes under RLNC are based on Lemma 1. Thus such schemes still work under NRSC.

VII. Locating errors under NRSC 

In this section we show that the receivers in networks using NRSC are able to efficiently locate the network adversaries even without knowledge of the network topology. The high-level idea is that each column of the error matrix plays the role of the vector $\mathbf{e}$ for RS-DECODE($H$, $\mathbf{e}$) (see Section II-I for details), where the columns of the Reed-Solomon parity-check matrix $H$ consist of the VIRVs of network edges. Thus the output of RS-DECODE($H$, $\mathbf{e}$) locates the set of error edges. At the end of this section, we provide an efficient algorithm that locates the edges suffering random errors without a priori knowledge of the network topology.

Assumptions and Justifications

1) At most $z$ edges suffer errors, i.e., $\{e : e \in \mathcal{E}, \mathbf{z}(e) \neq 0\} = \mathcal{Z}$ and $|\mathcal{Z}| \le z$. When $2z + 1 \le C$, network error-correcting codes (see Section II-G for details) are used so that the source message $X$ is provably decodable.

2) Each node in $\mathcal{V} - \{r\}$ has out-degree at least $d = 2z$. Note that Theorem 13 proves this is a necessary condition for locating $z$ errors.

Let the elements in $\mathcal{V} \otimes \mathcal{V}$ be indexed by $\{1, 2, \ldots, |\mathcal{V}|^2\}$. The parity-check matrix $H \in \mathbb{F}_q^{d \times |\mathcal{V}|^2}$ is defined as $H = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_{|\mathcal{V}|^2}]$. Here $\mathbf{h}_i$ is the VIRV (with length $d$) of the $i$th element in $\mathcal{V} \otimes \mathcal{V}$. Then the adversarial error locating algorithm is:

• ALGORITHM VI, LOCATE-ADVERSARY-RS: The algorithm locates network adversarial errors for networks performing NRSC.

• The input of the algorithm is the source matrix $X$, the parity-check matrix $H$, and the $C \times n$ matrix $Y$ received by receiver $r$. The output of the algorithm is a set of edges $\mathcal{Z}'$ initialized as an empty set.

• Step A: Compute $Y_{(RS,d)} = T''(In(r), d)Y$ and $L = Y_{(RS,d)} - X_d$, where $X_d$ consists of the first $d$ rows of $X$.

• Step B: For each column of $L$, say $\mathbf{v}$, compute $\mathbf{b} = $ RS-DECODE($H$, $\mathbf{v}$). If the $i$th component of $\mathbf{b}$ is nonzero, the $i$th node pair $(u, v)$ in $\mathcal{V} \otimes \mathcal{V}$ is added as an edge $e = (u, v)$ into $\mathcal{Z}'$.

• Step C: End LOCATE-ADVERSARY-RS.
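To make Step B concrete, the sketch below replaces RS-DECODE by a brute-force search for the sparsest combination of columns of $H$ that reproduces a column of $L$. This is only a stand-in for illustration: it is exponential in the weight bound and is not the efficient Reed-Solomon decoder of Section II-I. A prime field size `p`, the function names, and the toy IDs are assumptions.

```python
from itertools import combinations

def virv(edge_id, length, p):
    """Dimension-parameterized VIRV [id, id^2, ..., id^length] over GF(p)."""
    return [pow(edge_id, k, p) for k in range(1, length + 1)]

def rank_mod_p(vectors, p):
    """Rank over GF(p) of a list of equal-length vectors (Gaussian elimination)."""
    m = [[x % p for x in v] for v in vectors]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def sparse_support(H_cols, v, max_weight, p):
    """Indices of a smallest set of at most max_weight columns of H spanning v,
    or None if no such set exists (brute-force stand-in for RS-DECODE)."""
    if not any(x % p for x in v):
        return ()
    for w in range(1, max_weight + 1):
        for S in combinations(range(len(H_cols)), w):
            basis = [H_cols[i] for i in S]
            if rank_mod_p(basis + [v], p) == rank_mod_p(basis, p):
                return S
    return None

# Toy example over GF(13) with z = 2 and d = 2z = 4; six node pairs with IDs 1..6.
p, z, d = 13, 2, 4
H_cols = [virv(i, d, p) for i in [1, 2, 3, 4, 5, 6]]
# One column of L: errors with values 3 and 7 on the pairs with indices 1 and 4.
v = [(3 * a + 7 * b) % p for a, b in zip(H_cols[1], H_cols[4])]
print(sparse_support(H_cols, v, z, p))
```

With $d = 2z$ rows and distinct nonzero IDs, any $2z$ columns of $H$ are linearly independent, so the support of weight at most $z$ is unique.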

Theorem 20: The edge set $\mathcal{Z}'$ output by LOCATE-ADVERSARY-RS equals the actual error edge set $\mathcal{Z}$. The computational complexity of LOCATE-ADVERSARY-RS is $O(n|\mathcal{V}|^2 d)$.

Before the proof we show the following key lemma, which holds when $|Out(u)| \ge d$ for each node $u \in \mathcal{V} - \{r\}$. Recall that $\mathbf{z}(e)$ is the error packet injected on edge $e$.

Lemma 21: If the source message matrix $X$ equals 0,

$Y_{(RS,d)} = \sum_{e \in \mathcal{E}} \mathbf{t}''(e, d)\mathbf{z}(e)$. (9)

Proof: We proceed inductively. Throughout the proof let $\mathcal{E}_t$ be the set of edges satisfying the lemma, i.e., $Y_{(RS,d)} = \sum_{e \in \mathcal{E}_t} \mathbf{t}''(e, d)\mathbf{z}(e)$ when $\mathbf{z}(e) = 0$ for all $e \in \mathcal{E} - \mathcal{E}_t$.
Step A: If $\mathcal{E}_t = In(r)$, the lemma is true by definition.

Step B: Since the network is acyclic, unless $\mathcal{E}_t = \mathcal{E}$, there must exist an edge $e \in \mathcal{E} - \mathcal{E}_t$ such that its adjacent outgoing edge set $Out(e)$ is a subset of $\mathcal{E}_t$. Let $Out(e) = \{e_1, e_2, \ldots, e_k\}$ with $k \ge d$. If only $e$ suffers a non-zero injected error $\mathbf{z}(e)$, the output of $e$ is $\mathbf{z}(e)$. Thus for each $i \in [1,k]$ the output of $e_i$ is $\beta_i\mathbf{z}(e)$, where $\beta_i$ is the $i$th component of $\mathbf{b}(e) = T''(Out(e), k)^{-1}\mathbf{t}''(e, k)$ (see Section VI-D for details). Since $d \le k$, we have $\sum_{i \in [1,k]} \beta_i\mathbf{t}''(e_i, d) = \mathbf{t}''(e, d)$. Since $Out(e) \subseteq \mathcal{E}_t$, $Y_{(RS,d)} = \sum_{i \in [1,k]} \mathbf{t}''(e_i, d)\beta_i\mathbf{z}(e) = \mathbf{t}''(e, d)\mathbf{z}(e)$. Therefore Equation (9) is true for the case where only $e$ suffers a non-zero injected error $\mathbf{z}(e)$. Since NRSC is a linear code, $e$ can be added into $\mathcal{E}_t$.

Step C: Since the network is acyclic and each node (or edge) in $\mathcal{V}$ (or $\mathcal{E}$) is connected to $r$, we can repeat Step B until $\mathcal{E}_t = \mathcal{E}$. □



Recalling the definition of IRV in Section III-A, we have $Y = \sum_{e \in \mathcal{E}} \mathbf{t}'(e)\mathbf{z}(e)$ (when $X = 0$). Thus the following corollary is true for networks satisfying $|Out(u)| \ge d$ for each node $u \in \mathcal{V} - \{r\}$:
Corollary 22: For each edge $e \in \mathcal{E}$, $T''(In(r), d)\mathbf{t}'(e) = \mathbf{t}''(e, d)$.

For the case where no error happens in the network and the source $s$ transmits the $C \times n$ message matrix $X$ with $C \ge d$, by Lemma 21 above we have $Y_{(RS,d)} = \sum_{i \in [1,C]} \mathbf{t}''(e_i, d)\mathbf{x}(e_i)$, where $e_i$ is the $i$th edge of $Out(s)$ and $\mathbf{x}(e_i)$ is the $i$th row of $M = T''(Out(s), C)^{-1}X$ (see Section VI-D for details). Thus $Y_{(RS,d)} = T''(Out(s), d)M = X_d$, where $X_d$ is the matrix consisting of the first $d$ rows of $X$.
Then we have the corollary:

Corollary 23: When the source message is $X$, $Y_{(RS,d)} = X_d + \sum_{e \in \mathcal{E}} \mathbf{t}''(e, d)\mathbf{z}(e)$.
Then we can prove the main theorem of this section:

Proof of Theorem 20: Using Corollary 23 we have $L = \sum_{e \in \mathcal{Z}} \mathbf{t}''(e, d)\mathbf{z}(e)$. Since $|\mathcal{Z}| \le z \le d/2$, each column of $L$ is a linear combination of at most $d/2$ columns of $H$. Additionally, since $H$ is a parity-check matrix of a Reed-Solomon code, RS-DECODE correctly finds all the edges with nonzero injected errors, and therefore $\mathcal{Z}' = \mathcal{Z}$. For each column of $L$, RS-DECODE runs in time $O(|\mathcal{V}|^2 d)$. Thus the overall time complexity of the algorithm is $O(n|\mathcal{V}|^2 d)$. □
At the end of this section, under the condition that $|Out(u)| \ge d$ for each node $u \in \mathcal{V} - \{r\}$, we show that NRSC enables the receiver $r$ to locate any $z \le d - 1$ random errors without a priori knowledge of the network topology. The scheme is as follows:

Locate random errors under NRSC: Once the matrix $L$ is computed by Step A of LOCATE-ADVERSARY-RS, for each $(u,v) \in \mathcal{V} \otimes \mathcal{V}$ check whether $\mathbf{t}''((u,v), d)$ is in the column space of $L$. If so, $e(u,v)$ is output as an error edge. Continue the loop with another node pair in $\mathcal{V} \times \mathcal{V}$.

By Corollary 22 and Equation (9), $L = T''(\mathcal{Z}, d)Z$, where the rows of $Z$ consist of $\{\mathbf{z}(e) : e \in \mathcal{Z}\}$. By the proof of Lemma 9, $Z$ has rank $|\mathcal{Z}|$ with high probability. Thus for each edge $e \in \mathcal{Z}$, $\mathbf{t}''(e, d)$ is in the column space of $L$. For any edge $e' \notin \mathcal{Z}$, since $id(e')$ is different from every ID in $\{id(e) : e \in \mathcal{Z}\}$, $\mathbf{t}''(e', d)$ is linearly independent of the columns of $T''(\mathcal{Z}, d)$. Thus $\mathbf{t}''(e', d)$ is not in the column space of $L$.
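The membership test of this scheme can be sketched as follows. The sketch assumes a prime field size `p`, represents $L$ as a list of rows, and uses hypothetical node-pair IDs and error packets. None of these names come from the paper.

```python
def virv(edge_id, length, p):
    """Dimension-parameterized VIRV [id, id^2, ..., id^length] over GF(p)."""
    return [pow(edge_id, k, p) for k in range(1, length + 1)]

def rank_mod_p(vectors, p):
    """Rank over GF(p) of a list of equal-length vectors (Gaussian elimination)."""
    m = [[x % p for x in v] for v in vectors]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rank, len(m)) if m[i][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(x * inv) % p for x in m[rank]]
        for i in range(len(m)):
            if i != rank and m[i][col]:
                f = m[i][col]
                m[i] = [(a - f * b) % p for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

def locate_random_nrsc(L, pair_ids, d, p):
    """Return the node pairs whose length-d VIRV lies in the column space of L."""
    cols = [list(c) for c in zip(*L)]  # columns of L as vectors
    base_rank = rank_mod_p(cols, p)
    return {pair for pair, pid in pair_ids.items()
            if rank_mod_p(cols + [virv(pid, d, p)], p) == base_rank}

# Toy example over GF(13) with d = 3 and packet length n = 4.
p, d = 13, 3
pair_ids = {("a", "b"): 2, ("b", "c"): 5, ("a", "c"): 7, ("c", "d"): 9}
errors = {("a", "b"): [1, 4, 0, 2], ("a", "c"): [3, 3, 5, 1]}  # z(e) per edge
# L = T''(Z, d) Z, built row by row.
L = [[sum(virv(pair_ids[e], d, p)[k] * zrow[c] for e, zrow in errors.items()) % p
      for c in range(4)] for k in range(d)]
print(locate_random_nrsc(L, pair_ids, d, p))
```

No topology information enters the computation: only the public IDs and the matrix $L$ are used.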

VIII. Topology estimation for network with random errors under NRSC 
This section provides a lightweight topology estimation algorithm under NRSC for the random error model. The high-level idea is that once a candidate IRV is collected using Algorithm II, FIND-IRV of Section IV-C, the corresponding VIRV can be computed by Corollary 22. Using the VIRV the corresponding edge can be detected. Thus Algorithm III, FIND-TOPO, which requires FIND-IRV to recover all IRV information, is not involved.

For estimating the entire network topology, all assumptions in Section IV-C are required here except for Assumption 6), which assumes weak-type common randomness.

Note that if the network has strong connectivity and each edge suffers random errors with non-negligible probability, the algorithm for locating random errors shown at the end of Section VII can also detect the topology. The algorithm shown below only requires weak connectivity, i.e., each internal node has out-degree at least 2, as in Assumption 2) of Section IV-C.



• ALGORITHM VII, FIND-TOPO-RS: Under NRSC, the algorithm estimates the network topology in the presence of random errors.

• The input is $\{Y(i), i \in [1,t]\}$, the received matrices for source generations $\{1, 2, \ldots, t\}$. The output is $\mathcal{E}'$, a set of edges initialized as an empty set.

• Step A: For $i \in [1,t]$, compute $E(i)_r$ as in Step A of Algorithm II, FIND-IRV.

• Step B: For any two of $\{E(i)_r, i \in [1,t]\}$, say $E(i)_r$ and $E(j)_r$, compute the intersection $E(i)_r \cap E(j)_r$. If the intersection is a rank-one subspace $\langle \mathbf{h} \rangle$, go to Step C. Otherwise, continue the loop at the beginning of Step B.

• Step C: Compute $h_1$ (and $h_2$) as the first (and second) component of $T''(In(r), 2)\mathbf{h}$. For any node pair $(u, v) \in \mathcal{V} \otimes \mathcal{V}$, if the ratio $h_2/h_1$ equals $id(u, v)$, add $(u, v)$ as an edge into $\mathcal{E}'$. Go back and continue the loop at the beginning of Step B.

• Step D: End FIND-TOPO-RS.
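Step C reduces to one field division followed by a table lookup. A minimal sketch, assuming a prime field size `p` and a hypothetical dictionary `pair_ids` of public node-pair IDs, is:

```python
def match_edges(h1, h2, pair_ids, p):
    """Return the node pairs whose ID equals the ratio h2 / h1 in GF(p)."""
    if h1 % p == 0:
        return []  # ratio undefined; no candidate edge
    ratio = (h2 * pow(h1, -1, p)) % p  # modular inverse (Python 3.8+)
    return [pair for pair, pid in pair_ids.items() if pid == ratio]

# Toy example over GF(17): h is a scalar multiple (here 4) of the IRV of an
# edge with ID 5, so T''(In(r), 2) h = 4 * [5, 5^2] = [3, 15] mod 17.
pair_ids = {("a", "b"): 3, ("b", "c"): 5, ("a", "c"): 11}
print(match_edges(3, 15, pair_ids, 17))
```

The unknown scalar multiplying the IRV cancels in the ratio, which is why only the first two components of the transformed vector are needed.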

Let $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ be the actual network topology, $p_c$ be the probability defined in Assumption 4) of Section IV-C, $p_s$ be $1 - (1 - z/q)[1 - 2C^2/(n - C)]$ and $p'_a$ be $p_c + 2p_s + C|\mathcal{E}|^2/q$. Then the theorem is:
Theorem 24: 1) With probability at most $|\mathcal{V}|^2 t^2/q$, $\mathcal{E}'$ has an edge which is not in $\mathcal{E}$.
2) If edge $e \in \mathcal{Z}(i) \cap \mathcal{Z}(j)$, then $e \in \mathcal{E}'$ with probability at least $1 - p'_a$.
Proof:

1) Consider a node pair $(u, v) \in \mathcal{V} \otimes \mathcal{V}$ which is not in $\mathcal{E}$. Since $id(u, v)$ is independent of the network coding coefficients used in $\mathcal{G}$ and the random errors in each source generation, for any invocation of Step C the ratio $h_2/h_1$ is independent of $id(u, v)$. Thus $h_2/h_1 = id(u, v)$ with probability at most $1/q$. Since there are at most $t^2$ invocations of Step C, by the Union Bound [31] edge $e(u, v)$ is accepted in $\mathcal{E}'$ with probability at most $t^2/q$. Since there are at most $|\mathcal{V}|^2$ node pairs, again by the Union Bound [31] $\mathcal{E}'$ has an edge which is not in $\mathcal{E}$ with probability at most $|\mathcal{V}|^2 t^2/q$.

2) If $e \in \mathcal{Z}(i) \cap \mathcal{Z}(j)$, from the proof of Theorem 8 the intersection $E(i)_r \cap E(j)_r$ equals $\langle \mathbf{t}'(e) \rangle$ with probability at least $1 - p'_a$. Note that the difference between $p_a$ and $p'_a$ comes from the difference between Lemma 1 (which is for RLNC) and Lemma 19 (which is for NRSC). Since each internal node has out-degree at least 2, from Corollary 22 we have $T''(In(r), 2)\mathbf{t}'(e) = \mathbf{t}''(e, 2) = [id(e), (id(e))^2]^T$. This completes the proof.

□

Remark 1: For estimating the failing topology (i.e., detecting the edges with errors), even Assumption 5) of Section IV-C, which requires that each edge suffers random errors with non-negligible probability, is no longer needed. Once an edge $e$ has random errors in multiple source generations, it can be detected with high probability.

Remark 2: For the scenario in which network edges (or nodes) are updated dynamically, FIND-TOPO-RS is more robust than the topology estimation algorithm under RLNC (see Section IV-C for details). The reason is that under RLNC the receiver must use algorithm FIND-IRV to recover all IRV information before proceeding to the topology estimation algorithm FIND-TOPO. Thus it requires the network to remain unchanged during $t = \Theta(\log(|\mathcal{E}|)|\mathcal{E}|)$ source generations (see Remark 2 after Theorem 8 for details). Under NRSC, for detecting edge $e$, FIND-TOPO-RS only requires the network to remain unchanged between two failures of $e$.

IX. Conclusion and Future Work 

This work examines passive network tomography on networks performing linear network coding in the presence 
of network errors. We consider both random and adversarial errors. In part I, under random linear network coding 
(RLNC) we give characterizations of when it is possible to find the topology, and thence the locations of the 
network errors. Under RLNC, many of the algorithms we provide have polynomial time computational complexity 
in the network size; for those that are not efficient, we prove intractability by showing reductions to computationally 
hard problems. In part II, we design network Reed-Solomon coding (NRSC) to address the undesirable tomography capabilities of RLNC under some (especially adversarial error) settings, while preserving the key advantages of RLNC.

Possible future work can proceed in many directions. 

1) Adversarial nodes cannot be located exactly in general networks. For instance, when the adversarial node $u$ pretends it is receiving erroneous transmissions from its upstream neighbor $v$, it is impossible for the receiver to determine whether the error is located at $u$ or at $v$. Hence tomography schemes that approximately locate adversarial nodes are desirable.

2) Tomography schemes that approximately estimate the network topology in the presence of adversarial errors 
are hoped for. 

3) The question of designing a network coding scheme that enables efficient topology estimation in the presence 
of adversarial errors, and yet preserves key advantages of RLNC (low-complexity rate-optimal distributed 
coding), is open.

X. Appendix

A. Network erasure model 

An erasure on edge $e$ means that the packet $\mathbf{x}(e)$ carried by $e$ is treated as an all-zeros length-$n$ vector over $\mathbb{F}_q$ by the node receiving $\mathbf{x}(e)$, i.e., the injected erroneous packet $\mathbf{z}(e)$ equals $-\mathbf{x}(e)$. Two network erasure models are considered:

1) Random erasures: Every edge $e$ in $\mathcal{E}$ experiences random erasures independently.

2) Adversarial erasures: The edges that suffer erasures are adversarially chosen.



B. Topology estimation for network erasures under RLNC 



Since adversarial erasures are a weaker attack model than adversarial errors, the results in Section IV-B can be directly applied to the case of adversarial erasures.

The topology estimation scheme for random erasures is slightly different from that for random errors. The difference comes from the fact that in the random error model the injected errors in $Z$ are chosen at random, while in the random erasure model the injected errors are exactly the negative of the messages transferred. Thus Lemma 9 for the random error model is not always true for the random erasure model. Hence we need Lemma 25 below as an alternative.

Let $\mathcal{Z}$ be the set of edges suffering erasures and $|\mathcal{Z}| = z$. Let $\mathbf{t}(e) \in \mathbb{F}_q^{1 \times C}$ be the global encoding vector [35] of edge $e$, i.e., the packet carried by $e$ is $\mathbf{t}(e)X$ when no errors or erasures happen in the network. Let $T(\mathcal{Z}) \in \mathbb{F}_q^{z \times C}$ be the matrix whose rows consist of $\{\mathbf{t}(e), e \in \mathcal{Z}\}$. Recall that $E = T'(\mathcal{Z})Z$ (as defined in Equation (js])), where the rows of $Z$ consist of $\{\mathbf{z}(e) : e \in \mathcal{Z}\}$, i.e., $\{-\mathbf{x}(e) : e \in \mathcal{Z}\}$. Then we have:

Lemma 25: If the source has max-flow z to the headers of the edges in Z, with probability at least 1 — \£\/q, 
the matrix Z of injected errors has full row rank z and thus E = T'{Z). 

Proof: Since the network is directed and acyclic, for ease of analysis we impose a partial order on the edges of
$\mathcal{Z} = \{e_1, e_2, \ldots, e_z\}$. In particular, for any $j > i$, $e_j$ cannot be upstream of $e_i$.

By an argument similar to that used earlier for RLNC, if the source has max-flow $z$ to the headers of the edges in $\mathcal{Z}$, then $T(\mathcal{Z})$ has full row rank
$z$ with probability at least $1 - |\mathcal{E}|/q$ under RLNC.

The error corresponding to the erasure on $e_1$ equals $-t(e_1)X$. The packet traversing $e_2$ may be affected by the
first erasure. Hence the error corresponding to the erasure on $e_2$ equals $-(t(e_2) - a(1,2)\,t(e_1))X = -\bar{t}(e_2)X$, where
$a(1,2) = c_{1,2}$ is the unit effect from $e_1$ to $e_2$. In general, with $\bar{t}(e_1) = t(e_1)$, the error corresponding to the erasure on $e_i$ equals

$$-\bar{t}(e_i)X = -\Big(t(e_i) - \sum_{j=1}^{i-1} c_{j,i}\,\bar{t}(e_j)\Big)X = -\Big(t(e_i) - \sum_{j=1}^{i-1} a(j,i)\,t(e_j)\Big)X,$$

where $c_{j,i}$ is the unit effect from $e_j$ to $e_i$.

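Thus $Z = -A\,T(\mathcal{Z})X$, where $A \in \mathbb{F}_q^{z \times z}$ and the $(i,j)$'th element of $A$ equals $-a(j,i)$ if $j < i$, $0$ if $j > i$, and $1$ if
$i = j$. Since $A$ is lower triangular with unit diagonal, it is invertible. If $T(\mathcal{Z})$ has full row rank $z$ and $X$ has an invertible $C \times C$ sub-matrix (for instance, the
header corresponding to the identity matrix used in RLNC), then $Z$ has full row rank $z$. Thus $E$ and $T'(\mathcal{Z})$ have the same column space.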
□ 
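The rank argument in the proof can be checked numerically on a toy instance. The following is a minimal sketch, not the paper's algorithm: the prime `P`, the matrices `T_Z`, `A_COEFF` and `X`, and the helper names are all hypothetical values chosen for illustration. It builds $Z = -A\,T(\mathcal{Z})X$ with a unit lower-triangular $A$ and confirms that $Z$ inherits the row rank of $T(\mathcal{Z})$ when $X$ carries an identity header.

```python
P = 257  # hypothetical prime field size q, chosen only for this sketch


def rank_mod_p(rows, p=P):
    """Rank of a matrix (list of rows) over GF(p), by Gaussian elimination."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def matmul_mod_p(a, b, p=P):
    """Matrix product over GF(p)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) % p for col in cols] for row in a]


def erasure_matrix(t_z, a_coeff, x, p=P):
    """Return Z = -A T(Z) X, where A[i][i] = 1 and A[i][j] = -a(j, i) for j < i."""
    z = len(t_z)
    a = [[0] * z for _ in range(z)]
    for i in range(z):
        a[i][i] = 1
        for j in range(i):
            a[i][j] = (-a_coeff[j][i]) % p
    atx = matmul_mod_p(matmul_mod_p(a, t_z, p), x, p)
    return [[(-v) % p for v in row] for row in atx]


# Toy instance (all values are illustrative assumptions): C = 4, n = 6, z = 3.
T_Z = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 1, 0]]   # full row rank 3
A_COEFF = [[0, 2, 3], [0, 0, 5], [0, 0, 0]]        # a(j, i) stored at [j][i], j < i
X = [[1, 0, 0, 0, 7, 8],                            # identity header I_C, then payload
     [0, 1, 0, 0, 9, 10],
     [0, 0, 1, 0, 11, 12],
     [0, 0, 0, 1, 13, 14]]
Z = erasure_matrix(T_Z, A_COEFF, X)
print("rank(T(Z)) =", rank_mod_p(T_Z), " rank(Z) =", rank_mod_p(Z))
```

Because $A$ is invertible and the identity header gives $X$ full row rank, the two ranks agree for any choice of `A_COEFF`; a rank-deficient `T_Z` would make `Z` rank-deficient by the same amount.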

To estimate the topology for random erasures under RLNC we use a two-stage scheme similar to that in
Section IV-C. That is, stage 1 is used for collecting IRV information from multiple source generations, and stage
2 is used for constructing the topology from the IRV information collected in stage 1.

For stage 1, recall that the identity matrix $I_C$ is the header of the source matrix $X(i)$, where $i$ denotes the index
of the source generation. Thus the header of $Y(i)$ is $Y(i)_h = T - T'(\mathcal{Z}(i))A(i)T(\mathcal{Z}(i))$, where $A(i)$ is defined in
the proof of Lemma 25. For $i_1 \neq i_2$, the difference of the headers $Y(i_1)_h - Y(i_2)_h$ is $T'(\mathcal{Z}(i_2))A(i_2)T(\mathcal{Z}(i_2)) -
T'(\mathcal{Z}(i_1))A(i_1)T(\mathcal{Z}(i_1))$. Since both $A(i_1)$ and $A(i_2)$ are invertible matrices, the column space of $Y(i_1)_h - Y(i_2)_h$
equals that of $T'(\mathcal{Z}(i_1) \cup \mathcal{Z}(i_2))$ if $T(\mathcal{Z}(i_1) \cup \mathcal{Z}(i_2))$ has rank $|\mathcal{Z}(i_1)| + |\mathcal{Z}(i_2)|$. Thus $Y(i_1)_h - Y(i_2)_h$ can replace
$E(i)_r$ in FIND-IRV to provide the information of IRVs. With the same assumptions as those in Section IV-C,
we can use FIND-IRV to collect the IRV information, with $E(i)_r$ replaced by $Y(i_1)_h - Y(i_2)_h$ for a pair $(i_1, i_2)$ with $i_1 \neq i_2$.

For stage 2, we can directly use FIND-TOPO to recover the topology of the network. 
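The column-space claim used in stage 1 can be illustrated on a small example. This is only a sketch under stated assumptions: the matrices standing in for $T'(\mathcal{Z}(i))$, $A(i)$ and $T(\mathcal{Z}(i))$ are hypothetical values rather than ones derived from a real network, the prime `P` is arbitrary, and FIND-IRV itself is not implemented here. The code forms the header difference and checks that it spans the same column space as the stacked $[T'(\mathcal{Z}(i_1)) \mid T'(\mathcal{Z}(i_2))]$.

```python
P = 257  # hypothetical prime field size q, chosen only for this sketch


def rank_mod_p(rows, p=P):
    """Rank of a matrix (list of rows) over GF(p), by Gaussian elimination."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def matmul(a, b, p=P):
    """Matrix product over GF(p)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) % p for col in cols] for row in a]


def hstack(*mats):
    """Concatenate matrices with equal row counts side by side."""
    return [sum((m[r] for m in mats), []) for r in range(len(mats[0]))]


def header_difference(tp1, a1, t1, tp2, a2, t2, p=P):
    """Y(i1)_h - Y(i2)_h = T'(Z2) A2 T(Z2) - T'(Z1) A1 T(Z1); the common T cancels."""
    m1 = matmul(matmul(tp1, a1, p), t1, p)
    m2 = matmul(matmul(tp2, a2, p), t2, p)
    return [[(y - x) % p for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def same_column_space(m, n, p=P):
    """Two matrices span the same column space iff stacking them adds no rank."""
    return rank_mod_p(hstack(m, n), p) == rank_mod_p(m, p) == rank_mod_p(n, p)


# Illustrative values: C = 4, |Z(i1)| = 1, |Z(i2)| = 2.
TP1, A1, T1 = [[1], [0], [2], [1]], [[1]], [[1, 2, 3, 4]]
TP2 = [[0, 1], [1, 1], [0, 3], [2, 0]]
A2 = [[1, 0], [3, 1]]                     # unit lower triangular, hence invertible
T2 = [[0, 1, 0, 2], [1, 0, 1, 1]]         # stacked with T1, these rows have rank 3
D = header_difference(TP1, A1, T1, TP2, A2, T2)
print("column spaces match:", same_column_space(D, hstack(TP1, TP2)))
```

The check relies on the factorization of the difference as $[T'_1 \mid T'_2]\,\mathrm{diag}(-A_1, A_2)\,[T_1; T_2]$: when the stacked $[T_1; T_2]$ has full row rank, right-multiplication does not shrink the column space.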

C. Locating erasures under RLNC 

The algorithm LOCATE-RANDOM-RLNC can also be used for locating network erasures (both random and
adversarial), resulting in polynomial-time algorithms.



To locate random erasures, Lemma 25 proves that when the source has max-flow $|\mathcal{Z}|$ to the headers of the edges in $\mathcal{Z}$ that
suffer erasures, with high probability $\mathrm{rank}(Z) = z$ and $E$ and $T'(\mathcal{Z})$ have the same column space. Thus LOCATE-RANDOM-RLNC can be used to locate erasures
in the network, using $E$ in Step B in place of the matrix used there for random errors.

To use the efficient algorithm LOCATE-RANDOM-RLNC to locate adversarial erasures, by Lemma 25 it is
required that every node has in-degree at least $z$. Otherwise, the high-complexity algorithm LOCATE-ADVERSARY-RLNC
can be used to find the locations of the adversarial erasures.

Remark: The algorithm for locating erasures can also be used for locating edges experiencing problematic delays.
Let $Y_d \in \mathbb{F}_q^{C \times n}$ be the delayed packet matrix received by $r$. Then $r$ can locate the delayed edges by treating $Y_d$ as
the erasure matrix $E$ and then using the scheme for locating network erasures.



